DOCUMENT RESUME

ED 338 691 TM 017 520 



AUTHOR          Dorr-Bremme, Donald W.; Herman, Joan L.
TITLE           Assessing Student Achievement: A Profile of Classroom
                Practices. CSE Monograph Series in Evaluation 11.
INSTITUTION     California Univ., Los Angeles. Center for the Study
                of Evaluation.
SPONS AGENCY    National Inst. of Education (ED), Washington, DC.
PUB DATE        86
NOTE            132p.
PUB TYPE        Statistical Data (110) -- Reports -
                Research/Technical (143)



EDRS PRICE      MF01/PC06 Plus Postage.
DESCRIPTORS     *Academic Achievement; Classroom Techniques;
                Educational Assessment; *Elementary School Teachers;
                Elementary Secondary Education; Evaluation
                Utilization; Instructional Leadership; National
                Surveys; *Principals; *Secondary School Teachers;
                *Student Evaluation; Tables (Data); Teacher Made
                Tests; Testing Programs; *Test Use
IDENTIFIERS     Testing Effects; *Test Use in Schools Study



ABSTRACT 

The Center for the Study of Evaluation at the University of
California (Los Angeles) conducted the Test Use in Schools study to
map the topography of basic-skills achievement testing and
achievement test use in public schools across the United States. The
survey addressed a nationwide sample of principals and teachers
drawn through a successive, random-selection procedure, and focused
on determining how leadership activities impact test use. Responses
were obtained from 220 principals, 475 elementary school teachers,
and 363 high school teachers in 91 of the 114 districts sampled.
Return rates were about 50% for high school teachers, about 60% for
principals, and about 60% for elementary school teachers. Examination
of the survey responses confirms that there are two tiers of student
achievement assessment in the country. One tier is internal or local
to the schools; it is "owned" and usually produced by the teachers
themselves as tests and other assessments of achievement in the
classroom by the individual student. The other tier of assessment is
external to the school; it is mandated by district, state, or federal
program requirements. Both tiers are underutilized; neither fulfills
its potential for helping students. Case studies and research-based
models are included to illustrate how schools and districts can
integrate these tiers to plan for instructional improvement. There
are 31 tables of study data and 5 figures illustrating the concepts
discussed. A 50-item list of references is included. (SLD)



***********************************************************************
* Reproductions supplied by EDRS are the best that can be made        *
* from the original document.                                         *
***********************************************************************








CSE MONOGRAPH SERIES
IN EVALUATION 



SERIES EDITOR 
Eva L. Baker 



Center for the Study of Evaluation
UCLA Graduate School of Education
University of California, Los Angeles
Los Angeles, California 90024



ASSESSING STUDENT ACHIEVEMENT:
A PROFILE OF CLASSROOM PRACTICES 



Authored by 
Donald W. Dorr-Bremme 

and 

Joan L. Herman 






CSE MONOGRAPH SERIES IN EVALUATION 



NUMBER 

1. Domain-Referenced Curriculum Evaluation: A Technical Handbook and a
Case Study from the MINNEMAST Project
Wells Hively, Graham Maxwell, George Rabehl, Donald Sension, and
Stephen Lundin

2. National Priorities for Elementary Education
Ralph Hoepfner, Paul A. Bradley, and William J. Doherty

3. Problems in Criterion-Referenced Measurement
Chester W. Harris, Marvin C. Alkin, and W. James Popham (Editors)

4. Evaluation and Decision Making: The Title VII Experience
Marvin C. Alkin, Jacqueline Kosecoff, Carol Fitz-Gibbon, and Richard
Seligman

5. Evaluation Study of the California State Preschool Program
Ralph Hoepfner and Arlene Fink

6. Achievement Test Items -- Methods of Study
Chester W. Harris, Andrea Pastorok Pearlman, and Rand R. Wilcox

7. The Logic of Evaluative Argument
Ernest R. House

8. Toward a Methodology of Naturalistic Inquiry in Educational Evaluation
Egon G. Guba

9. Values, Inquiry, and Education
Hendrik D. Gideonse, Robert Koff, and Joseph J. Schwab (Editors)

10. Evaluation in School Districts: Organizational Perspectives
Adrianne Bank and Richard C. Williams with James Burry (Editors)

11. Assessing Student Achievement: A Profile of Classroom Practices
Donald W. Dorr-Bremme and Joan L. Herman

This Project was supported in whole or in part by the National Institute of Education, Department
of Education. However, the opinions expressed herein do not necessarily reflect the position or
policy of the National Institute of Education, and no official endorsement by the National Institute
of Education should be inferred.






TABLE OF CONTENTS 



Acknowledgements

Chapter 1  Introduction

Chapter 2  Assessing Student Achievement: The Frequency
           of Testing and the Time it Takes

Chapter 3  Using Assessment Results

Chapter 4  Administrative Leadership: Monitoring and
           Supporting Assessment

Chapter 5  Principals' and Teachers' Perceptions and
           Beliefs About Testing

Chapter 6  The School Context and Classroom
           Testing Practices

Chapter 7  Summary and Implications: Issues for State
           and National Policy Makers

Chapter 8  Directions For Policy and Practice at the
           Local Level: Linking Testing With
           Instructional Planning and Improvement






ACKNOWLEDGEMENTS 



This monograph is the result of a five-year project conducted by the
Center for the Study of Evaluation with funding from the National 
Institute of Education (NIE). The project would not have been possible 
without the great contributions of many individuals: 

First and foremost, our thanks to Eva Baker, Director of the Center,
who made the project possible. Dr. Baker suggested the need for the 
project, helped guide its initial design, and contributed her ideas, 
enthusiasm, and support throughout the entire project. 

Our thanks also to the many CSE staff members who participated
with us: Jennie Yeh, who co-directed the study during its first critical
year; Bruce Choppin and William Doherty, who helped to direct its
data analysis; James Burry, who contributed significantly to every
phase of the project; Linda Polin, Charlotte Lazar-Morrison and
Raymond Moy, who ably conducted the literature review; Beverly
Cabello and Liza Daniels, who provided critical assistance in the
design and implementation of the field study and testing costs; James
Lehman and Chih-Ping Chou, who competently organized the data and
responded to our innumerable requests for new analyses; and to Aeri
Lee and Katharine Fry, who patiently and ably managed the production
of this manuscript.

We are grateful also to Lawrence Rudner, our NIE monitor, who 
continuously encouraged our work and to the Institute itself for the 
support it provided. 

Finally and especially, we would like to thank the district 
administrators, school principals, teachers and others in the schools 
who cooperated in the study. We greatly appreciate the time they 
donated to us, filling out our questionnaires, responding to our
questions, and sharing their perceptions with us. 

To all who participated with us and made our work possible, we
extend our sincere appreciation for their important contributions. 

Donald W. Dorr-Bremme
Joan L. Herman 






CHAPTER 1 
INTRODUCTION 



Fueled by school board accountability concerns, minimum competency
mandates, evaluation requirements for federal, state and local
programs, and the growth of curriculum-embedded and continuum-based
assessment systems, achievement testing in American schools has
become both an enterprise of significant scope and visibility and the
subject of considerable public discussion and debate. Critics have
attacked the arbitrariness of current testing practices (Baker, 1978),
have expressed concerns about their validity and bias (Perrone, 1978),
have accused testing of narrowing the curriculum and have questioned
the value of traditional testing amidst changing functions of education
(Tyler, 1977). The quality of available tests continues to be
controversial (CSE, 1979; The Huron Institute, 1978), at least one major
teachers' organization called for a moratorium on the use of
standardized tests, and vigorous legal battles have been launched.

Responding to these various challenges, advocates of testing have
reaffirmed its importance and reasserted the variety of purposes that
current tests can and do serve. Supporters have maintained, for
example, that testing promotes accountability, facilitates more accurate
placement and selection decisions, and yields information useful for
curricular and instructional improvement.

The testing controversy rages on while the nation's considerable
investment in achievement testing continues. Although the stakes in the
debate are high, public policy in this arena has been formulated without
the benefit of basic information about the nature of testing as it actually
occurs and is used in schools. How much testing really goes on? How
are test results used? What functions do tests serve for teachers and
principals? What are the effects on schools of various local, state and
federal mandates? These and similar questions have gone largely
unaddressed. A few studies have indicated teachers' reservations about
and limited use of one type of achievement measure, the
norm-referenced standardized test (Airasian, 1979; Boyd et al., 1975;
Goslin, 1965; Goslin, Epstein, & Hallock, 1965; Resnick, 1981;
Salmon-Cox, 1981; Stetz & Beck, 1979). Beyond this, however, the
landscape of testing practices and test use in American schools remains
largely unexplored.






In this context, the UCLA Center for the Study of Evaluation's (CSE)
three-year study provides educational policy-makers with basic, new
information on classroom achievement testing across the United States.
Conducted from 1979 through 1983, CSE's research was designed to
take a comprehensive picture of national testing practices. It
investigated a wide range of types of formal assessment measures (e.g.,
commercially produced norm- and criterion-referenced tests and
curriculum-embedded measures; tests of minimum competency and
functional literacy; district-, school-, and teacher-developed tests) as
well as some less formal means for gauging student progress and
achievement (teachers' observations of and interactions with learners).
Within this broad range, inquiry focused on achievement testing
practices in reading/English and in mathematics, basic skills areas which
are the subject of continuing public concern. Teachers and principals at
both elementary and secondary grade levels served as primary subjects
for the nationwide survey, addressing those grade levels which had been
identified in prior research as important transition points and the
targets of frequent testing. The research commenced with an extensive
literature review and exploratory fieldwork in three school districts
across the country to identify relevant contextual variables and to
deepen our understanding of teachers' and principals' orientations. Case
study inquiry following the survey explored in greater detail issues
associated with the costs of testing.

Policy Orientation: Questions and Issues of Interest

As the discussion above suggests, educational achievement testing is
a pervasive enterprise, one which recurrently affects the lives of all
students. It is an enterprise which is rapidly changing, diversifying and
expanding. And it is an enterprise in which hundreds of millions of
dollars in public monies are expended annually. It is not surprising,
then, that it generates a broad range of questions and issues for
policymakers to address. The CSE study examined a number of these:

Competency testing. Across the nation, more than 40 states have now
mandated tests of minimum competency for school children. Some
states require such tests for promotion and graduation; others for
checking students' basic educational needs at milestones in their school
careers. Decisionmakers at all levels need to know how these testing
programs are influencing students' educational experiences and life
chances. What are the impacts of different kinds of minimum
competency programs? Have they affected curriculum and instruction?
Have they wrought changes in the other ways districts and schools
measure students' progress?






Testing for federal and state program evaluation. Federal and state
categorical programs, meanwhile, continue to include evaluation
requirements. Testing student achievement remains a primary way of
meeting those requirements. Program administrators and technical
assistance personnel in both funding agencies and participating districts,
along with legislators and their advisors, need cost-benefit information
on testing in this context. Can it and does it serve purposes beyond
accountability and compliance? How does testing for federal and state
program evaluation affect the instructional time of participating
students? How does it influence the distribution of instructional staff
members' energies and efforts?

District continuum testing. Simultaneously with the above activities,
many school districts are expanding their own testing programs. And
increasingly these district tests monitor students' progress along
district-mandated sequences (or continua) of skills or objectives. From
district to district, however, teachers may differ in their willingness to
administer such tests and to utilize the results. Under what conditions,
then, are tests accompanying skills continua most likely to be
administered and used in instructing students? What qualities should the
tests have to be maximally useful? How can they be effectively
integrated with other assessment activities? District administrators
require information to resolve these issues.

Teacher-constructed tests and other assessment techniques. Teachers
themselves seem to spend significant amounts of their assessment time
in administering tests and quizzes that they construct. They also seem to
devote considerable attention, especially in the elementary grades, to
commercially produced tests that come with curriculum materials.
What are the qualities in these kinds of tests that make them attractive
and useful?

Defining the Research Problem 

Given the vast array of policy issues and information needs surrounding
educational testing, how should a national student survey be focused?
CSE's Test Use Survey was guided by two interrelated concepts:

- the concept of the teacher as practical reasoner and
  decisionmaker;

- the concept of testing as an intervention.

The teacher as practical reasoner and decisionmaker. The view of
teachers as practical reasoners and decisionmakers emerges from theory
and research from the branch of sociology known as ethnomethodology
(Cicourel, 1974; Garfinkel, 1967; Cicourel & Kitsuse, 1963; Mehan &
Wood, 1975; Wieder, 1973; Wood, 1968). According to this view, as
practical reasoners and practical decisionmakers, members of social
units:

- Orient their activities to the practical tasks they must accomplish
  in their everyday routines and do so in light of the practical
  contingencies and exigencies they face;

- Carry out their activities based on their "background
  understandings" of a "world known in common and taken for
  granted" (Schutz, 1962). That world is validated and supported daily
  through members' collective activities. Members act as "naive
  phenomenologists," taking things as they seem to be until
  unfolding experience proves them otherwise. Thus they sustain
  their orientations to their practical tasks and circumstances.

Data from the Test Use in Schools Study's planning-stage fieldwork
efforts support such a view. That teachers do orient their efforts to the 
practical tasks that are demonstrably central in their everyday profes- 
sional lives and do orient to the practical exigencies they face was 
recurrently documented. Teachers, for example, reported their uses of 
test results as serving most heavily the functions that are central to their 
routine teaching responsibilities: deciding what to teach and how to 
teach it to students of different achievement levels; keeping track of 
how students are progressing; and evaluating and grading students on 
their performance (Dorr-Bremme, 1983). Further, the means of assess- 
ment that teachers reported using most often and in the greatest variety 
of ways were those which facilitate the accomplishment of their practi- 
cal activities and respond to the practical exigencies they face. 

A variety of routine tasks constitute the world of teaching as prac- 
ticed. Teachers must accomplish these tasks in a context characterized 
by recurrent time limits, others' demands for high performance and 
accountability, and their own concerns with providing effective and 
appropriate instruction. These features of the teaching world impinge 
upon teachers' testing practices and test use. Thus, it appears that their 
reasoning and decision-making about assessment and its uses are
structured by and oriented to their practical circumstances.

Testing as an intervention. A second concept framing the Test Use in
Schools Survey was the concept of testing as an intervention. From this
perspective, required or recommended tests, by virtue of their very
presence in schools, can impact educational practices. They can, in fact,
function as change agents. Supporting this point of view, planning stage
research indicated that:

1. Mandated tests can add new standards of accountability to those
that teachers must attend to in their everyday routines. Reasoning
practically, teachers may feel responsible for adjusting their
instructional emphases and techniques to match the skills and
information students must master to do well on required tests. For
example, minimum competency tests, particularly those required for
graduation, seem especially likely to re-orient teachers' practical
reasoning and instructional planning and induce them, individually and
schoolwide, to alter curriculum and teaching methods.

2. Mandated tests can change the practical circumstances under
which teaching and learning must be accomplished. Respondents in the
exploratory field research, for instance, cited a number of unintended,
largely negative, effects of testing programs, e.g., reduction in time for
teaching. Where consequences of this type occur, they alter the
practical contingencies that teachers face in accomplishing their routine
activities. As they do, they may occasion broader changes in
instructional practices, curriculum, and perhaps in students' learning as
well.

3. Mandated tests, where they respond to teachers' practical
exigencies, can provide new ways to accomplish routine tasks and can
signal new approaches to instructional practice. Fieldwork in two
districts, for example, illustrated the ways in which a district continuum
test can respond to teachers' assessment needs and facilitate more
individualized instructional approaches. Under such circumstances,
testing programs of particular kinds can serve as agents for educational
change.

Framework for the National Survey

The two related concepts of the teacher as a practical reasoner and
testing as an intervention provided a useful organizing framework for
the national survey of assessment practices and uses in schools and
classrooms. In addition to informing the selection of domains to be
examined in survey questionnaires, this framework indicated some
interesting relationships to be explored. These domains and hypothetical
relationships are displayed in Figure 1. (Notice that not all the
relationships portrayed there were examined in the national survey.)

Federal/state/local testing requirements. Attention to such
requirements responds to the concept of testing as an intervention. As
depicted, testing requirements influence the distribution and frequency
of types of testing at local sites, and thus bear upon patterns of test use.
(That is, districts may introduce innovative tests that teachers use
heavily to replace self-constructed tests, etc. Federal and state
evaluation requirements may encourage consolidation of assessment
activities and use of extant tests for "new" purposes, or they may simply
introduce additional testing at local sites.) Following the chain of
posited relationships further, testing interventions such as minimum
competency programs may impact on the organization of curriculum
and instruction (as described above).

Given that types of assessment seem to impact on one another and
given the seeming importance of minimum competency testing as an
agent of change, districts were sampled on presence/absence of
statewide assessment and on various conditions of minimum
competency testing. Data on the federal-, state-, and district-initiated
testing in sampled districts and schools were elicited in brief, initial,
district-contact phone interviews with district testing officers and
through principal questionnaires.

Federal/state/local programs. The presence/absence of particular
federal and state categorical programs, and local educational programs
as well, is assumed to influence how curriculum and instruction are
organized in schools and, in turn, the routine tasks of local-site
practitioners. (For instance, Title I and Title VII programs and programs
developed in response to Public Law 94-142 occasion referral,
placement, and diagnostic decisions.) The testing that occurs and the test
scores that are used follow from needs inherent in these routine tasks.
The study was not explicitly interested in studying how federal, state,
or local programs impact on the organization of curriculum and
instruction locally (dotted line, arrows). It was only interested in the
presence/absence of the instructional alternatives such programs
provide. Thus only information on district and school participation in
major, instruction-related federal and state programs, e.g., Title I
(Chapter 2), was gathered.

Organization of curriculum and instruction. The organization of
curriculum and instruction constitutes a main influence on the nature of
teachers' routine, practical activities and decisions. If students are
grouped by reading level or set to work in individualized, self-paced
learning programs, the teachers need to make placement decisions. If a
continuum of objectives or "management system" is established, then
teachers must monitor learners' progress through that continuum. If
team teaching is practiced or aides are available for instructing students,
students must be distributed to the instructional alternatives afforded by
extra personnel (Yeh, 1978; Yeh, 1980). In summary, it was
hypothesized that a greater variety and number of available instructional
alternatives in the classroom and school would increase the routine tasks
and decisions that require assessment information, and so influence both
the patterns of testing that occur locally and the ways test scores are
used locally.






Data on the organization of curriculum and instruction were gathered
primarily on teacher questionnaires: e.g., the presence/absence of aides
and team teaching, the ways teachers distribute students for instruction
within the class, presence and type of instructional support services
beyond the classroom. Information on the latter was also elicited from
principals.

Types of students served. The nature of practitioners' routine, practi- 
cal activities and decisions was assumed to vary with the types of 
students enrolled in the school and assigned to a teacher's classroom. 
Students whose first language is not English, who are members of
socioeconomically depressed and/or culturally different populations, 
whose rate of achievement is unusually rapid, and so on, present teach- 
ers with different kinds of instructional challenges and decisions. Thus, 
the types of testing given locally and the uses of test results are likely to 
vary with the demographic or achievement characteristics of children in 
the school and classroom. 

Breakdowns of sampled schools' enrollments by socioeconomic sta- 
tus (as indicated by percent receiving Aid to Families with Dependent 
Children, percent receiving free lunch, and similar indices) and ethnic 
identity were elicited from principals. Principals were also asked to
provide contextual information on the rate of transience in school 
enrollment year-to-year and on recent general enrollment trends. 

Teachers' perceptions of the utility of tests and types of tests. As
teachers go about the accomplishment of their practical tasks and
decisions, the instances in which they refer to test scores and the ways in
which they "count" or "weigh" test scores are assumed to vary with
their perceptions (opinions, values, understandings) of tests and types
of tests (see Lazar-Morrison et al., 1980; Yeh, 1980).

Survey instruments for teacher respondents gathered data on
teachers' perceptions and beliefs about testing particular types of tests
and testing in general.

Teachers' experience and training. As they go about making sense of
particular tests' strengths and weaknesses, appropriate uses, and the
like, teachers (the model assumes) will draw upon their formal
educational and practical experiences with respect to testing. Thus, their
training and experience are likely to bear ultimately on their practical
decisions about which types of test scores to use and how to use them.
Teacher questionnaires asked respondents to report succinctly on the
number of years they have been teaching and the number of years they
have been teaching in their present school. (The latter was assumed to
index teachers' familiarity with existing local assessment programs and
practices, socialization to local norms and values, etc.) Information on
teachers' educational background knowledge and in-service training
experience also was elicited.

District and local site leadership action. It was assumed that
innovative district and school leadership can provide in-service training
experiences that change teachers' perceptions of the utility of particular
tests and types of tests, thus influencing teachers' practical test-use
decisions. District and school leaders can also, it was posited, act to
generate tests, testing programs, and testing practices that facilitate
teachers' accomplishment of their routine tasks under the practical
exigencies of their environments (see Dorr-Bremme, 1983). Finally,
district and school leaders may act to require that teachers use certain
test scores for particular purposes.

The study was not explicitly interested in how types of leadership 
action impact on types of in-service training in testing (dotted lines, 
arrows). The study was interested, however, in how leadership activi- 
ties of particular kinds impact on test use (solid line, arrows). Data on 
district-wide leadership action were collected in initial-contact phone 
interviews with district testing officials and on principal
questionnaires. Information on school-site leadership was gathered from teacher
questionnaires. 

Types of tests given: purposes and frequency. Describing the types of
tests given at local school sites was a central goal of the study. So too
was identifying the factors that influence the purposes for tests and the
frequency with which they are given: hence the inclusion of the domains
discussed in the foregoing paragraphs.

The model assumed that the types of test given locally, and the
purposes for and frequency with which they are given, will influence
local types of test-score use. This assumption was made for more than
the obvious reason, that the giving of a type of test makes its scores
available. It was also posited that the presence/absence of one type of
test may influence the use of scores from another type. The giving of
minimum competency tests as a requirement for graduation, for
instance, may encourage teachers to use the results of other kinds of tests
to measure students' progress toward attainment of the minimum
competencies. (This phenomenon was observed in a junior high school
visited during exploratory field work.) Similarly, the absence of
particular types of testing in a local setting may co-occur with more
diverse uses of the results of tests that are given there.

Data on the types of tests given, and on the purposes for and frequen- 
cy with which each is administered, were elicited from both teachers 
and principals, assuring a comprehensive picture of the pattern of 
testing in each school and classroom sampled. 




Figure 1

Conceptual Model Guiding Test Use Survey Inquiry

[Diagram not reproduced. Boxes shown: Federal/State/Local Requirements;
Organization of Curriculum and Instruction; Teachers' Experience and
Training; District and Local Site Leadership Action; Teachers' Routine
Practical Activities and Decisions; Teachers' Perceptions of Utility of
Tests and Types of Tests; Types of Test Score Use; Types of Tests Given,
Purposes and Frequency; Impacts; Types of Students Served.

Legend: posited relationships examined directly in study; posited
relationships underlying study design, not examined directly in study;
domain of inquiry, data collected; concept underlying study design, no
data collected explicitly.]






Types of test score use. Describing how scores from particular types
of tests are actually used was another primary goal of the research. And
identifying the factors that influence type-of-test-score/type-of-test-use
relationships was yet another.

Information on how scores from particular kinds of tests are used in
classrooms was elicited on teacher questionnaires. Data on other
school-wide uses of test scores were gathered on principal
questionnaires.

Impacts. As Figure 1 shows and as earlier discussion has explained,
it was assumed that testing can have influence within schools in two
ways. First, testing can have influence through practitioners' use of test
scores in decision making. For example, curriculum program and/or
instructional strategies might be changed in response to a program
evaluation including test scores as measures of program effectiveness.
Test scores might influence student placement decisions. Second, tests
can impact on curriculum and instruction by virtue of their very
presence as required or recommended. In the study's conceptual
framework, then, both "types of test score use" and "types of tests given"
are assumed to have potential impact.

The conceptual model also calls attention to the study's interest in the
impacts of particular types of testing and test-score use for learners in
general and for particular types of learners (referenced as "types of
students served"). The model also indicates the interest of the research
in impacts of particular types of testing and test-score use on curriculum
and instructional activities. These potential impacts were discernible in
the research through:

(1) Questionnaire items that investigated the ways in which test
scores are used.

(2) Questionnaire items that asked about respondents' perceptions
of the impacts of particular types of testing on their students,
classrooms, and schools.

(3) Data analyses that examined relationships between types of students
served (e.g., by socioeconomic condition) and amount of
testing, types of tests given, and patterns of test score use.

The Survey Sample* 

The survey addressed a nation-wide sample of principals and teach- 
ers drawn through a successive, random-selection procedure. Given the 
study's intent to provide a comprehensive picture of current testing 
practices, sampling procedures were devised to yield a nationally representative
sample of respondents. Stratifying variables reflected this
concern for representativeness, as well as the need for variables whose
values were easily obtainable; these included geographic region of the
country, district size, urban-suburban-rural locale, socioeconomic status,
and minimum competency testing policy. The latter two variables
also reflect the study's interest in clarifying policy issues, though the
number of policy-relevant variables which could be included
in sampling was severely limited by available information. While it
might have been interesting to stratify the sample based on district
leadership or types of district-required tests, for example, no prior
information existed which would permit selections based on these
variables.

Respondent sampling proceeded as follows. First, a nationally repre- 
sentative probability sample of 114 school districts was drawn. (A 
lattice sampling technique was used to select cells from the matrix 
defined by the five stratifying variables. Then random sampling was 
done to select within cells.) Next, from within these districts, size 
permitting, two elementary schools and two high schools were random- 
ly selected using a procedure that facilitated (where possible) inclusion 
of schools at levels serving both higher- and lower-income populations. 
Finally, in each of these schools, principals received directions for 
randomly drawing four teachers for inclusion in the study. Directions 
for elementary principals guided the random selection of two fourth- 
grade and two sixth-grade teachers; those for high school principals 
directed the random selection of two teachers of tenth-grade English 
and two of tenth-grade mathematics. 
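
The three-stage selection described above (districts, then schools, then teachers) can be illustrated in code. What follows is a minimal sketch only, not the study's procedure: simple random draws stand in for the lattice technique and for the grade-specific teacher selection, and the frame layout, the function name `draw_sample`, and the toy data are all assumptions made for the example.

```python
import random

def draw_sample(frame, n_districts, schools_per_level=2, teachers_per_school=4, seed=0):
    """Hypothetical multi-stage draw: districts -> schools -> teachers.

    `frame` maps district -> level ("elementary" or "high") -> school -> teachers.
    Simple random sampling stands in for the study's lattice technique.
    """
    rng = random.Random(seed)
    sample = {}
    for district in rng.sample(sorted(frame), n_districts):
        sample[district] = {}
        for level, schools in frame[district].items():
            # "Size permitting": small districts contribute fewer schools.
            k = min(schools_per_level, len(schools))
            for school in rng.sample(sorted(schools), k):
                teachers = schools[school]
                m = min(teachers_per_school, len(teachers))
                sample[district][school] = rng.sample(teachers, m)
    return sample

# Toy frame (invented): 5 districts, each with 3 elementary and 3 high
# schools, and 6 teachers per school.
frame = {
    f"D{d}": {
        level: {
            f"D{d}-{level[0]}{s}": [f"D{d}-{level[0]}{s}-T{t}" for t in range(6)]
            for s in range(3)
        }
        for level in ("elementary", "high")
    }
    for d in range(5)
}
sample = draw_sample(frame, n_districts=3)
```

With this toy frame, each sampled district yields four schools (two per level) and four teachers per school, mirroring the counts given in the text.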

The principal and each of the four participating teachers at each 
school received questionnaires that elicited detailed information on 
their individual and school testing practices, as well as related contex- 
tual and attitudinal data. 

Return rates. Returns were obtained from 220 principals, 475
elementary-school teachers, and 363 high-school teachers in 91 of the
114 districts sampled. Return rates from all principals and from teachers
at the elementary level were approximately 60%. About 50% of the
high school teachers in the sample responded. To correct for differential
return rates by sampling cell, and to approximate a nationally representative
distribution of respondents, weightings were applied in all descriptive
analyses. The results reported in the following chapters,
therefore, represent weighted estimates of national testing practices,
test use patterns, and principal and teacher perceptions and beliefs on
testing-related issues.
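
The effect of weighting for differential return rates can be seen in a small example. This is a sketch under stated assumptions, not the study's actual weighting scheme: each cell's weight is assumed to be the number of questionnaires sent divided by the number returned, and the cell names and figures are invented.

```python
def cell_weights(sampled, returned):
    """Assumed weight per sampling cell: questionnaires sent / returned."""
    return {cell: sampled[cell] / returned[cell] for cell in sampled}

def weighted_mean(responses, weights):
    """Weighted mean of (cell, value) responses using per-cell weights."""
    total = sum(weights[cell] * value for cell, value in responses)
    norm = sum(weights[cell] for cell, _ in responses)
    return total / norm

# Invented example: cell "A" returned half its questionnaires, cell "B" all.
sampled = {"A": 4, "B": 2}
returned = {"A": 2, "B": 2}
weights = cell_weights(sampled, returned)

responses = [("A", 10.0), ("A", 14.0), ("B", 20.0), ("B", 24.0)]
unweighted = sum(value for _, value in responses) / len(responses)
estimate = weighted_mean(responses, weights)
```

Here the under-responding cell counts double, so the weighted estimate moves toward that cell's values relative to the unweighted mean.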



*A more detailed description of the sampling procedures is available in Burry et al.,
1982.



What was the nature of the selected schools, their teachers and
classrooms? In order to provide context for understanding the results
presented in later chapters, the remainder of this section describes the
characteristics of the school environment in which the respondents
operate and then the teachers themselves.

The average elementary school in the sample served a total enrollment
of 528, composed of a majority Caucasian but ethnically mixed
student population. While the typical school community was economically
heterogeneous, a significant minority of students received federal
aid and/or qualified for free school lunch benefits. Transiency and
absence rates were relatively modest, 16 and 6 percent respectively. A
majority of the schools (60%) operated a school improvement program,
and student achievement testing was typically included and required in
such programs. Over one-half of the schools operated under minimum
competency testing requirements; while within these schools most students
passed such required tests on the first try, a sizeable number of
students (20%) typically experienced failure. (See Table 1.)

Secondary school enrollments, as would be expected, were substantially
higher, with a mean of 1439. While other characteristics were
quite similar to those at the elementary school level, students in the average
high school in the sample appeared slightly more economically advantaged
and less transient.

The average teacher within the schools described above had approximately
twelve years of teaching experience, almost ten of which were in
their current district. (The results are presented in Table 2.) In terms of
their education, the respondents were almost evenly split between those
holding a Bachelors degree and those holding a Masters degree, with
less than 1% holding a doctorate. Further, they tended to average some
24 to 25 college units beyond their highest degree. The picture of the
teachers, then, is one of experienced, educationally qualified
professionals who have continued to pursue education. It is interesting
to note how similar the characteristics were across the elementary and
secondary levels. At both levels, however, these characteristics appeared
unrelated to testing practices.

The routine of the classrooms these teachers taught in is also described
in the results found in Table 2. The results indicate that teachers
had in their classrooms approximately 27 students at the elementary
level and 26 at the secondary level. At the elementary level, they
provided over 6.5 hours of reading instruction per week and about 5
hours of mathematics instruction. The results at the secondary level
were similar for mathematics, i.e., about 5.5 hours of instruction per
week. However, fewer hours of English instruction occurred at the
secondary level (approximately 5.4 hours) than reading instruction at
the elementary level, reflecting both the greater emphasis on reading 
earlier in a student's career and the broadening of the curriculum as a 
student progresses through higher grade levels, as well as standard class 
periods at the secondary level. It will be useful to compare these 
average hours of weekly instruction with the amount of time devoted to 
testing. This is done in the next chapter, where the frequency of testing 
and the time it takes are described. 

Table 1. School Characteristics

                                      Elementary             Secondary
                                    Mean     S.D.          Mean     S.D.

Total Enrollment                    528      (235)         1439     (696.3)

School Ethnicity
  Black                             15.0%    (25.8)        15.0%    (25.5)
  Hispanic                           8.1%    (21.2)         6.8%    (18.4)
  Asian                              2.1%    ( 9.2)         0.7%    ( 1.2)
  Native American                    5.5%    (20.4)         0.4%    ( 2.1)
  [row illegible in source]
  Other                              1.2%    ( 9.9)         0.7%    ( 5.7)

Socio-Economic Status
  Low income (<$8,000)              29.0%    (26.2)        22.4%    (20.2)
  Middle income                     50.6%    (23.4)        56.7%    (19.3)
  High income (>$25,000)            20.5%    (21.7)        21.8%    (17.6)
  % of students receiving
    AFDC or free lunch              31.0%    (26.2)        23.2%    (22.8)

Transiency Rate                     15.5%    (13.7)        10.4%    ( 7.8)

Absentee Rate                        6.0%    ( 9.4)         7.4%    ( 3.7)

School Improvement Program
  % Participating                   59.7%                  63.0%
  % Requiring Testing               76.3%                  65.7%

Minimum Competency Testing
  Required                          53.3%                  50.0%
  % Students passing first time     80.0%    (23.0)        76.1%    (22.6)



Table 2. Teacher Characteristics

                                          Elementary         Secondary

Average Number of Years of
  Teaching Experience:                    12.03  (7.50)      12.69  (7.50)

Average Number of Years of
  Teaching in District:                    9.68  (6.94)      10.04  (7.00)

Percentage of Teachers whose
  Highest Diploma is:
    Bachelors                             57.92              50.66
    Masters                               41.65              48.44
    Doctorate                              0.17               0.91

Average Number of credits/
  units beyond last degree:               24.10  (24.39)     25.82  (22.34)

Average Number of students in class       27.11  (9.45)      26.09  (9.84)

Average Hours per week of
  Reading or English:                      6.55  (1.97)       5.38  (1.78)

Average Hours per week of Mathematics      5.19  (1.44)       5.62  (1.67)






CHAPTER 2 

ASSESSING STUDENT 
ACHIEVEMENT: THE FREQUENCY 
OF TESTING AND 
THE TIME IT TAKES 



As CSE researchers interviewed teachers across the United States,
the teachers spoke of the many ways in which they assess students' progress
and monitor the results of their teaching. Routine class and homework
assignments, teachers pointed out, provide recurrent information on
students' learning. Classroom interaction (during question-and-answer
recitation and discussions, when students ask for help with their
work, as they read orally or work problems at the board, etc.) yields
immediate, continuous feedback on how students are doing. Special
projects, presentations, and reports offer additional data on student
progress and teaching effectiveness. Testing, then, is viewed by teachers
as only one among the many strategies in their repertoire for measuring
students' achievement.

Teachers' interview remarks imply that testing means for them eliciting
information from individual students, usually through paper-and-pencil
instruments, under controlled conditions, i.e., conditions which
preclude students' access to texts, notes, and others' assistance. While
this definition of testing is hardly unique, it does differentiate teachers'
view of testing from their perspective on assessment in general. From
their viewpoint (as noted above), assessment of student achievement
goes on constantly during the course of classroom teaching and learning.
Testing, in contrast, occurs periodically in time set aside explicitly
for that purpose. The amount of testing that teachers report thus represents
only a small proportion of their assessment efforts, an observation
which provides important context for interpreting the following discussion
on how much testing goes on in schools.

CSE's national survey asked teachers to list each type of test their 
students receive over the course of a school year in reading or English 
and mathematics, the frequency with which each type is administered to 
their "typical student," and the approximate length of time it takes that 
student to complete a usual test of each type. Teachers' responses 



provide a picture of the annual class time students spend taking tests in
these basic skills subjects. This picture is described first in the sections
below; then it is supplemented with fieldwork findings that highlight
some additional time testing entails for both students and their teachers.

The National Picture: Modest Amounts of Time on Testing 

Elementary students spend less than 10 percent of the annual allocated
instructional time in basic skills testing. Table 3 shows the average
annual time students devote to test taking, as well as the average
frequency and duration of testing, in each subject and level of schooling
surveyed.

As these figures indicate, the typical student in the upper-elementary
grades spends about 10 hours a year taking reading tests and 12½ hours
a year taking mathematics tests. Test taking, then, consumes about four
percent of the average time allocated to formal instruction in reading
and close to seven percent of the average time given to formal instruction
in mathematics during the entire school year. (These percentages
are based on the average instructional time reported by the elementary-school
teachers surveyed: 6½ hours a week in reading, 5 hours a week
in mathematics. Here and throughout this section, calculations assume
a school year of 37 weeks or 180 days of actual instruction.)
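
The arithmetic behind these percentages can be reproduced directly. The sketch below assumes the annual totals reported in Table 3 (9 hrs. 36 min. for reading, 12 hrs. 28 min. for mathematics) and the 37-week year stated above; the function name is illustrative only.

```python
WEEKS_PER_YEAR = 37  # school year assumed in the text

def share_of_instruction(test_hours_per_year, instruction_hours_per_week):
    """Annual test-taking time as a fraction of annual instructional time."""
    return test_hours_per_year / (instruction_hours_per_week * WEEKS_PER_YEAR)

reading = share_of_instruction(9 + 36 / 60, 6.5)   # Table 3 total, 6.5 hrs/week
math = share_of_instruction(12 + 28 / 60, 5.0)     # Table 3 total, 5 hrs/week
print(f"reading: {reading:.1%}, mathematics: {math:.1%}")
```

The results, roughly 4 percent for reading and just under 7 percent for mathematics, agree with the figures quoted in the paragraph.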



Table 3. Time Devoted to Testing in Typical Classes

                                   Total Amount of      No. of Test        Average
                                   Class Time Spent     Sessions for       Length
                                   on Testing           Typical Student    of Session
                                   per Annum

Elementary School (Grades 4-6)
  Reading Tests                     9 hrs. 36 min.      22                 27 min.
  Mathematics Tests                12 hrs. 28 min.      23                 32 min.

10th Grade English Class           26 hrs. 34 min.      49                 32 min.

10th Grade Mathematics Class       24 hrs. 18 min.      45                 33 min.



Table 4. Time Devoted to Required Testing, As a
Percentage of Total Testing Time For Typical Classes

                                   Percentage           Percentage           Percentage
                                   Time on Testing      Time on Testing      Testing Time
                                   Required by          Required by          Devoted to
                                   State                Local School         Non-Required
                                                        District             Tests

Elementary School (Grades 4-6)
  Reading                          30                   29                   41
  Mathematics                      21                   25                   54

10th Grade English Class           12                   13                   74

10th Grade Mathematics Class        9                   14                   77



Elementary students take a test in reading and a test in math about
once every eight days. Students' test-taking time, of course, is seldom
distributed evenly from week to week across the school year. Periods of
more intensive testing can occur at the elementary level, for example,
during administration of placement and diagnostic measures, standardized
test batteries (with their reading and math sub-tests), and end-of-book
or end-of-level exams. Routine quizzes and chapter tests are often
deferred at such times or in other special circumstances. With this
caveat, the averages in Table 3 yield rough estimates of general testing
patterns. They indicate that throughout the year the typical upper-elementary
student faces a half-hour test in reading and a half-hour test
in math about once in every eight school days.

High school students spend 12 to 13 percent of their time in English
and mathematics class taking tests. Students in high school appear to
spend more of their class time taking tests. Survey results reveal that
the typical tenth-grader enrolled in an English class spends nearly 26½
hours yearly completing tests in that subject. This constitutes a little
over 13% of their annual time in English instruction, which teachers'
reports indicate averages 5.4 hours weekly across the school year.

A typical tenth-grade mathematics student devotes somewhat more
than 24 hours to math tests in a school year. At an average of 5½ hours
weekly for mathematics instruction, this equals about 12% of their class
time.

High school students take an English test and a math test every three-to-four
days. As Table 3 shows, in the subjects surveyed the average
testing session in tenth grade lasts only moments longer than in upper-elementary
classes. On the average, however, the typical tenth-grader is
tested about twice as frequently. He or she encounters a half-hour test in
English class roughly every three-and-a-half days; in mathematics
class, about once every four days.
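
The testing intervals quoted for both levels follow from dividing the school year by the number of test sessions. A minimal sketch, assuming the session counts shown in Table 3 and the 180-day year used throughout this section:

```python
SCHOOL_DAYS = 180  # days of actual instruction assumed in the text

# Annual test sessions for the typical student, as read from Table 3.
sessions = {
    "elementary reading": 22,
    "elementary mathematics": 23,
    "10th-grade English": 49,
    "10th-grade mathematics": 45,
}

days_between = {name: SCHOOL_DAYS / n for name, n in sessions.items()}
for name, days in days_between.items():
    print(f"{name}: one test about every {days:.1f} school days")
```

The elementary intervals come out near eight school days and the tenth-grade intervals near three-and-a-half to four, matching the text.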



Mandated tests consume substantial proportions of students' total
test-taking time. How much of the test-taking time just described results
from tests mandated by agencies beyond the school? How much occurs
at teachers' discretion? Table 4 provides answers to these questions.
Elementary-school teachers in the sample report that on the average
about half their students' test-taking time in both reading and math is
spent on measures required by their state or school district. At the high-school
level, state and district mandates account for about a quarter of
the time students spend taking tests in both English and mathematics.
Notice, then, that since high school students on the average spend twice
as much time annually being tested as elementary students do, these
percentages suggest that the actual number of hours spent in required
testing is quite similar at both levels of schooling. Notice, too, that a
greater proportion of assessment in the high school subjects is
voluntary: conducted at the discretion of the individual teacher.
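
The claim that required testing takes a similar number of hours at both levels can be checked by combining the two tables. The sketch below assumes the annual totals as read from Table 3 and the state and district percentages as read from Table 4; the variable names are illustrative.

```python
# Per class: (annual test-taking minutes from Table 3,
#             % state-required, % district-required from Table 4)
classes = {
    "elementary reading": (9 * 60 + 36, 30, 29),
    "elementary mathematics": (12 * 60 + 28, 21, 25),
    "10th-grade English": (26 * 60 + 34, 12, 13),
    "10th-grade mathematics": (24 * 60 + 18, 9, 14),
}

# Hours per year spent on state- or district-required tests, per subject.
required_hours = {
    name: minutes * (state + district) / 100 / 60
    for name, (minutes, state, district) in classes.items()
}
for name, hours in required_hours.items():
    print(f"{name}: {hours:.1f} hours of required testing per year")
```

Under these readings, every subject falls between about five-and-a-half and six-and-a-half hours a year, which is consistent with the observation above.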

Students spend most of their time on teacher-developed tests. Which
types of tests call for greater proportions of students' test-taking time?
To address this question, the survey employed test-type categories that
recurred consistently and spontaneously in the talk of teachers, school
administrators, and counselors during open-ended pre-survey interviews.
The goal was to give survey respondents a categorization system
as similar as possible to the one they use naturally in their everyday
thinking and conversation about assessment. As Table 5 demonstrates,
this system differentiates tests primarily in terms of their point of
origin, i.e., according to who develops the measure and/or requires its
use.



Table 5. Time on Different Tests, As a Percentage of the
Total Student Time Devoted to Test-Taking

                                        Elementary            10th Grade    10th Grade
                                        Teachers              English       Mathematics
TYPE OF TEST                            Reading     Math      Teachers      Teachers

Tests which form part of a
  statewide assessment program          3           [illeg.]  5             1

Required Minimum Competency Tests       1           [illeg.]  1             1

Tests included with curriculum
  materials                             [illeg.]    35        [illeg.]      17

Other commercially published tests      17          18        6             3

Locally developed and district
  adopted tests                         13          8         5             [illeg.]

School or teacher developed tests       37          35        74            76

[illeg.] = entry illegible in the source scan.



A glance at the results in Table 5 shows immediately that tests
developed by individual teachers and schools and, at the elementary
level, those which accompany commercial curriculum materials, occupy
the great majority of students' testing time. Notice that these are the
types over which teachers have most control. They can administer them
when they deem appropriate; they can design (or readily adapt) the
content to suit their own teaching emphases. Most teachers interviewed
said that these types of tests fit best with their instructional schedules
and curricula. And, from their points of view, these are the most valid
instruments of those listed for such routine tasks as grading, on-going
planning of teaching, etc. (This will be discussed further in Chapter 3.)
The predominance of locally developed tests at the secondary level
supports the notion that high school teachers have more control over
classroom assessment than do elementary school teachers. But heavy
use of locally developed tests in the high schools may also reflect the
limited number of suitable commercial testing materials available.
Comprehensive curricular programs (including texts with coordinated
workbooks, tests, etc.) are more widely available for teachers of
the elementary grades.

Finally, note that the two types of testing most often generated by
state policy, minimum competency testing and state assessment,
consume on the average very small proportions of classroom testing
time.

The figures in Table 5 are averaged across all teachers in the survey,
including those in states without minimum competency testing requirements.
Even where minimum competency tests (MCT) are required in
the grades sampled, however, less than three percent of the testing time
at the sampled elementary grade levels and two percent of the testing
time in secondary grades and subjects sampled is taken up by these
tests. Where MCTs are available, but not required, they absorb less
than one percent of the total testing time in the grades and subjects
surveyed.

The picture with regard to statewide assessment programs is similar.
Such programs require no more than three percent of the total annual
testing time at the elementary level (or about 45 minutes per year on the
average for reading and mathematics combined). At the high school
level, tenth grade English assessment programs typically take about 75
minutes annually and mathematics programs an average of [illegible] minutes
per year.

Where there are no state minimum competency, proficiency, or functional
literacy testing requirements, students spend more time on classroom
achievement testing. Tests of minimum competency or proficiency
or functional literacy are now required of all students in over 40
states, representing about two-thirds of the nation's student enrollment.
In some states, passing these tests is a prerequisite for promotion to
certain grades and/or for high-school graduation. In others, they are
mandated only for diagnostic purposes: to assure that students with
deficiencies in basic skills are identified and offered remedial instruction.
Furthermore, some states designate specific instruments that must
be used in minimum competency testing, while legislation in other
states permits local school districts to select or construct tests of their
own choice.

Teachers' reports suggest that these minimum competency requirements
may somehow be affecting the amount of classroom achievement
testing teachers otherwise do. At least, teachers' survey reports show
that, when other sampling factors are controlled,* students in states



Table 6. Relationships Between State Minimum Competency
Testing Requirements and Students' Test-Taking Time,
Reported in Minutes

[Table body not recoverable from the source scan. Rows are types of state
minimum competency testing requirement; column groups are Elementary and
Secondary, each with English, Math, and Total per Teacher. One table note
marks differences that are statistically significant.]

*Other factors considered in sampling include socioeconomic status,
district size, and geographic region of the nation, among others.
See the Introduction for further details.


with no minimum competency requirements at all spend more time on
achievement testing each year than students elsewhere do. (See Table
6.) This difference is dramatic (and statistically significant) at the
secondary level, where all types of minimum competency requirements
appear to be accompanied by much less classroom testing (from 33 to
45 hours less annually) and where competency requirements for promotion
or graduation are accompanied by the least testing time of all.

At present, this pattern is difficult to explain. On the surface, it
seems to suggest that teachers have eschewed routine classroom testing
in favor of minimum competency measures: that they are permitting
minimum competency tests to take the place of other forms of assessment.
This interpretation, however, makes little logical sense. Proficiency or
minimum competency tests are given only at certain grade levels.
Typically, too, they are given in those grades only on a single occasion.
Thus, they cannot possibly supply the feedback on student performance
that teachers need regularly for monitoring students' learning progress,
assigning report card grades, making on-going teaching plans, and so
on. Furthermore, fieldwork visits to various states with different minimum
competency requirements revealed no reduction in routine tests
and quizzes. In fact, fieldwork suggested that, at least in the districts
visited, additional time can be spent in testing to assure that students
perform well on minimum competency measures. Nevertheless, careful
review of the survey instruments and the statistical analyses to which
they were subjected substantiates the findings displayed in Table 6. The
processes that underlie and explain these results await further study.

Socioeconomic status (SES) seems unrelated to students' test-taking
time. Given the evaluation and testing requirements that are commonly
associated with compensatory education programs, and given that these
programs serve students from lower socioeconomic backgrounds, many
people have speculated that lower SES students spend more of their
school time on testing than students from higher SES homes. CSE
survey results, however, indicate that this is not the case. Students in
lower SES areas do not spend more time taking tests than those in
middle-income or upper-income settings, nor do they even spend more
time taking tests required by their district, their state, or in conjunction
with federal educational program guidelines. This finding holds true
regardless of whether a district-level or a school-level indicator of
socioeconomic status is used.

In concluding this section, it is also worth noting that no other
variable included in this study (except minimum competency requirements)
appeared to have any relationship with the amount of time
students spend taking tests.




The discussion so far has centered on how much testing goes on in
the basic-skills subjects of reading or English and mathematics across
the nation's schools. Emphasis has been on the frequency of testing and
on the class time students spend with tests in hand, actually completing
them. Survey questions purposely focused on these topics as especially
relevant to a portrait of national practices.* Fieldwork results elaborate
these findings, providing an illustrative look at all the time students
spend on testing, at teachers' testing time, and at time on testing across
the curriculum.

Testing consumes student time before and after the test. In most
classrooms, testing demands more class time than that required for
students to complete their tests: time which is spent both before and
after they answer test questions. Wide-ranging interviews with teachers,
conducted by CSE both before and after the national survey, illustrate
how this time is spent and how much it can add up to.

Preparations for testing can begin days or even weeks before the test
is given. At a minimum, teachers inform their students when the test
will be, explain what it will cover, and say a word or two about the
question formats that students can expect. When mandated measures
such as standardized batteries or minimum-competency tests are due,
however, some teachers spend class time to train students in their
specific response formats and/or in general test-taking strategies. Some
also suspend teaching of the on-going curriculum, devoting class time
instead to review and practice of skills and content that they know these
tests will cover.

When the testing day arrives, of course, time is required for passing
out materials, giving directions, and handling students' questions. In
order to provide an appropriate environment for testing, some teachers
say, they routinely allow several moments for "settling students down"
and/or rearranging students' seating. Filling in student-identification
information and covering directions can be especially time-consuming
at the outset of special testing episodes. At the elementary level, teachers
often report spending a half-hour or more on these preliminaries
when standardized testing, state assessment, or minimum competency
measures are administered. Moving students from their classrooms to
special testing locations (the library, cafeteria, etc.), as is sometimes
done for the latter types of assessment and for high-school finals, is
another before-testing activity that can take up time.



*In addition, project resources were insufficient to examine testing in all subject areas,
and both pre-survey interviews and questionnaire piloting confirmed that eliciting
information on all the time associated with preparing for, taking, and reviewing tests
would place an enormous response burden on survey recipients.

Once students have completed a test, class time is given over to
collecting papers. Sometimes, tests are corrected in class. Then, if
necessary, regular classroom seating patterns are restored. Nearly all
teachers in the elementary grades report that they regularly set aside
time for students to "relax" or "cool out" after particularly important
or lengthy examinations. Some high schools accomplish this with special
schoolwide schedules for finals and (less often) mid-terms.

The amount of class time such activities as these consume appears to
vary markedly from classroom to classroom and school to school. In
two elementary schools, for example, every teacher in grades K through
6 was interviewed about all the time their students spend on test-related
activities in all subjects throughout the school year. In one of these
schools (Hillview Elementary), students usually spend an average of
91% of their total testing-related time actually answering test questions.
Only 9%, on the average, of the typical student's total time on
testing each year is taken up with before-the-test and after-the-test
activities of the kind described above. In the second elementary school
(Cityside), however, much more time is routinely spent on pre-testing
drills and review which, teachers avowed, were undertaken only because
mandated testing was about to occur. Furthermore, logistics in
support of testing (scheduling changes that reduced class time, room
reassignment for testing, etc.) claim a great deal of instructional
time during required-test administration each spring in this densely
populated school. Thus, students here spend only 55% of the average
annual time devoted to test-related activities actually taking tests. They
devote nearly as much time each year, in other words, to before-the-test
and after-the-test activities as they do to test taking. (For details on
these two schools, their testing programs, and their districts' testing
programs, etc., see Dorr-Bremme, 1983.)
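
The contrast between the two schools is easier to see when the percentages are converted into overhead per hour of actual test taking. A brief sketch, using only the 91% and 55% shares reported above (the function name is illustrative):

```python
def overhead_per_test_hour(share_answering):
    """Hours of before- and after-test activity per hour spent answering questions."""
    return (1 - share_answering) / share_answering

hillview = overhead_per_test_hour(0.91)   # roughly 0.1 hour per test hour
cityside = overhead_per_test_hour(0.55)   # roughly 0.8 hour per test hour
print(f"Hillview: {hillview:.2f}, Cityside: {cityside:.2f}")
```

At Cityside the overhead approaches one additional hour for every hour of test taking, which is the sense in which students there devote "nearly as much time" to surrounding activities as to the tests themselves.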

Similar interviews were conducted, although less intensively in any 
one school, with high school teachers. These suggest that secondary 
students usually spend 10 to 15 percent of their total yearly testing time 
in any one class on before- and after-testing activities. 

The percentages offered here, of course, are only illustrative. Nevertheless,
they do provide a useful context for interpreting the national
averages of students' test-taking time cited earlier.

In two elementary schools, testing across the curriculum consumed
eight to ten percent of students' available instructional time. How
much time do students spend on all test-and-testing related activities in
subjects across the curriculum? Fieldwork interviews in the two schools
mentioned in the last section also provide illustrative answers to this
question for students in elementary school. In the first of two schools
.30 



24 



AsshssiNG Si udi:nt AcHiiivtMi-Nr 



(Hillview), for instance, an average student devotes 88 hours a year to
preparing for, taking, and winding up and going over tests in all subjects.
This comprises about 10% of their annual class time (which
equals five hours daily, excluding lunchtime and recess, over 177
school days, or 885 hours per year). Across classrooms in the other
elementary school cited above (Cityside), students' total testing time in
all subjects averages 76 hours a year, or 8.6% of their annual class time
of 885 hours. Observations of testing episodes (including the before,
during, and after phases) suggest that the interview estimates upon
which these totals are based are generally quite accurate.
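
The percentages in this paragraph follow directly from the hours quoted. As a minimal sketch of that arithmetic (the variable and function names are illustrative and not part of the study), the shares can be recomputed like this:

```python
# Figures quoted in the text; all names here are illustrative only.
HOURS_PER_DAY = 5      # class hours per day, excluding lunchtime and recess
SCHOOL_DAYS = 177      # school days per year
annual_class_hours = HOURS_PER_DAY * SCHOOL_DAYS  # 885 hours per year

# Average hours per student per year on all testing activities, by school.
testing_hours = {"Hillview": 88, "Cityside": 76}


def share_of_class_time(hours, total=annual_class_hours):
    """Return testing hours as a percentage of annual class time."""
    return 100 * hours / total


shares = {school: round(share_of_class_time(h), 1)
          for school, h in testing_hours.items()}

for school, pct in shares.items():
    print(f"{school}: {pct}% of annual class time")
```

Hillview's share works out to just under 10%, which the text rounds to "about 10%".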

Tables 7 and 8 show how this time is distributed by subject area.
Notice that not all teachers test in all subjects and that testing in the
basic skills subjects of reading and mathematics (not including multi-subject
batteries, which also cover these subjects) consumes about 50%
of students' total time on testing in these two schools.

For each hour that students spend taking tests, teachers seem to
spend two to three more. The annual times students spend on test-taking
(Table 3 above) can serve as a rough indicator of the times that
teachers spend giving tests in the classroom. CSE's interviews with
teachers confirm that in most cases teachers actively monitor the class
and answer students' questions as testing is in progress. These same
interviews, however, suggest that teachers spend only about a quarter to
a third of their total time on testing in this way. That is, for each hour
they devote to giving a reading or math test, they typically spend
another two or three hours on such activities as preparing for testing
(e.g., constructing tests and dittoing them, reviewing directions for
state assessment or standardized-test administration), correcting and
grading tests, recording scores, etc. At the elementary level, teachers
also find that they spend a good deal of time checking over special
answer sheets used for machine scoring to be sure that the identification
information is correct, that there are no stray pencil marks to throw off
the scoring, etc.
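
The step from "a quarter to a third of their total time" to "another two or three hours" is a simple ratio. The sketch below makes it explicit; the function name is hypothetical and the only inputs are the two fractions quoted in the text.

```python
from fractions import Fraction


def other_hours_per_test_hour(share_giving):
    """Hours of other testing work for each hour spent giving tests.

    share_giving is the fraction of a teacher's total testing time that
    goes to administering tests in class.
    """
    return (1 - share_giving) / share_giving


# One quarter and one third are the bounds reported from the interviews.
for share in (Fraction(1, 4), Fraction(1, 3)):
    print(f"share giving tests = {share}: "
          f"{other_hours_per_test_hour(share)} additional hours per test hour")
```

If giving tests takes a quarter of the total, the remaining three quarters amount to three hours for every hour of administration; at a third, the figure is two hours.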

Interviews with elementary-school teachers indicate that they spend
about 12 to 15 percent of their annual reported work time, both in and
out of school, on achievement testing in all subject areas. This averages
about 200 to 250 hours through a school year. (Similar figures are
unavailable for high-school teachers, but they do appear to spend two
hours or so outside of class for every class hour of student testing.)

Tables 7 and 8 also display the total time that teachers in
the two case study elementary schools (Hillview and Cityside) spend
annually on testing in each subject. Note that testing in reading and
mathematics together demands over 50 percent of the total teacher time
on testing at each school. If the testing in these subjects that takes place
as part of multi-subject batteries were included, this percentage would
be higher.

Other staff members' time on testing. Administrators, as well as 
classroom aides (or paraprofessionals) and volunteers, also play a role 
in the work of testing. Classroom assistants spend their time much as 
teachers do: proctoring test administration, grading tests and recording 
scores, etc. School administrators typically spend their time coordinat- 
ing major schoolwide testing programs: overseeing distribution, ad- 
ministration, collection and checking of state-assessment measures, 
standardized testing, and/or minimum-competency (proficiency) as- 
sessment. (See Tables 7 and 8 for the time administrators and classroom 
assistants spend annually on all aspects of testing in the two case study 
schools.) 






Table 7. Hillview School, Littleton District
Distribution of Staff and Student Testing Time

Each staff category cell shows:
  No.   = number of staff members involved
  Avg h = average hours per staff member per year
  %     = percentage of total testing time for the staff category

Subject area          Row    Admin.   Teachers     Instr. spec.  Volunteers  Total staff  Avg. student  Classrooms
                                                                             (person hrs) time (hrs)    (Total = 30)

Reading               No.             11           1             1
                      Avg h           52.47        17.4          5.0         599.6        12.12         11
                      %               20.7%        8.8%                      19.0%

Mathematics           No.             11           1             3
                      Avg h           77.11        53.9          15.44       948.46       25.11         11
                      %               30.5%        27.3%         59.7%

Language Arts         No.             8            1
                      Avg h           24.30        34.75                     229.17       7.81          8
                      %               7.0%         17.6%                     7.3%

Spelling              No.             8            1
                      Avg h           51.42        21.58                     432.97       19.34         8
                      %               14.8%        10.9%                     13.7%

Social Studies        No.             5
                      Avg h           19.55                                  97.75        4.53          5
                      %                                                      3.1%

Science               No.             5
                      Avg h           28.0                                   140.0        5.8           5
                      %               5.0%                                   4.4%

Health, Phys. Ed.     No.             3
                      Avg h           8.33                                   25.0         7.19          3
                      %               0.9%                                   0.8%

Other, Miscellaneous  No.             3            1
                      Avg h           8.61         70.0                      95.83        [illegible]   3
                      %               [illegible]

Multi-Subject*        No.    2        11                         3
                      Avg h  49.87    42.06                      8.78        588.77       23.93         11
                      %      100.0%   16.6%                      33.9%       18.6%

TOTALS by staff       Hours  99.75    2782.5       197.63        77.66       3157.55      [illegible]
category              %      100.0%   100.0%       100.0%        100.0%      99.9%

*The Multi-subject category includes standardized tests which assess performance in several subject areas. Also included in this category is the general intelligence test given twice a year at the
same time as (i.e., on a day contiguous with) the standardized test. Some respondents reported time devoted to the intelligence tests as separate from that given to the standardized test; others did
not. Thus, time devoted to both is collapsed here.



Table 8. Cityside School, Metro District
Distribution of Staff and Student Testing Time

Each staff category cell shows:
  No.   = number of staff members involved
  Avg h = average hours per staff member per year
  %     = percentage of total testing time for the staff category

Subject area          Row    Admin.   Clerical  Teachers  Instr. spec.  Aides    Volunteers  Total staff  Avg. student  Classrooms
                                                                        (para.)              (person hrs) time (hrs)    (Total = 31)

Reading               No.    2        1         28        1             26       1
                      Avg h  139.66   10.3      54.61     74.0          15.31    11.67       2302.42      9.43          28
                      %      74.5%    100.0%    25.6%     45.0%         30.7%    12.6%       28.8%

Mathematics           No.                       27                      25       2
                      Avg h                     67.58                   15.51    33.06       2278.38      21.01         27
                      %                         30.5%                   29.9%    71.8%       28.6%

Language Arts         No.                       16                      10
                      Avg h                     25.42                   3.63                 443.0        18.71         16
                      %                         6.8%                    2.8%                 5.5%

Spelling              No.                       22                      18       1
                      Avg h                     54.25                   11.17    9.17        1403.67      25.83         22
                      %                         20.0%                   15.5%    10.0%       17.6%

Social Studies        No.                       10                      6
                      Avg h                     17.65                   4.12                 201.20       10.33         10
                      %                         2.9%                    1.9%                 2.6%

Science               No.                       5                       2
                      Avg h                     16.4                    0.63                 83.25        4.31          5
                      %                         1.4%                    0.09%                1.0%

Health, Phys. Ed.     No.                       6                       6
                      Avg h                     16.55                   9.52                 156.47       30.28         6
                      %                         1.7%                    4.4%                 2.0%

Other, Miscellaneous  No.                       6         1             4
                      Avg h                     40.27     74.0          10.34                356.96       0.19          6
                      %                         4.0%      45.0%         3.2%                 4.5%

Multi-Subject         No.                       26        2             28       2
                      Avg h  31.90              16.24     8.16          5.39     2.6         690.45       9.62          26
                      %      25.5%              7.1%      10.0%         11.6%    5.6%        9.4%

TOTALS by staff       Hours  375.0    10.3      5975.32   164.33        1298.5   92.22       7915.8
category              %      100.0%   100.0%    100.0%    100.0%        100.09%  100.0%





CHAPTER 3 
USING ASSESSMENT RESULTS 

The results of tests and other assessment techniques can be used for
many different purposes by educators in the schools. Nearly all educational
testing and measurement texts include long lists of these: diagnosing
learners' needs, placing students in programs, monitoring students'
progress, evaluating curriculum and instruction, planning for
school improvement, reporting to parents, satisfying accountability
requirements, and many others. Such lists outline the possibilities.
CSE's Test Use in Schools Study sought to identify actual practices.
Thus, both principals and teachers were asked how heavily they weigh
different types of test results and information from other sources in a
variety of routine decisions and tasks.

Figure 2, an example from the teacher survey, illustrates the form 
these questions took. 



Figure 2

Format of Survey Test-Use Questions for Teachers and Principals
Illustration from the Teacher Survey

22. When I initially group or place students for
    instruction, here's how important various
    sources of information are to me:

    (a) Previous teacher's comments, reports, grades        4 3 2 1 0
    (b) Students' standardized test scores                  4 3 2 1 0
    (c) Students' scores on district continuum or
        minimum competency tests                            4 3 2 1 0
    (d) Results of placement tests included with
        curriculum use                                      4 3 2 1 0
    (e) Results of other special placement tests            4 3 2 1 0
    (f) Results of tests I make up                          4 3 2 1 0
    (g) My own observations and students' classwork         4 3 2 1 0






The same format was followed in the questionnaires for principals. As
in the example, each question about a particular use of assessment
elicited information about a range of test types and about other modes
of assessment, e.g., observations and classwork, as well. Notice that
the test-type categories given in these questions are identical with those
employed in survey questions about students' testing time (Table 5
above). Recall that these were the test-type labels teachers and principals
used recurrently, without prompting, during the open-ended, pre-survey
interviews conducted in several school districts across the United
States. It is highly likely, therefore, that most survey respondents
found them familiar and meaningful.

Practically, the survey could not examine all the possible school and
classroom uses of assessment results. Choices had to be made in order
to keep questionnaires at a reasonable length. Pre-survey interviews
played a major role in guiding these choices. One of these interviews
asked respondents to name all the achievement tests that they gave their
students through the school year, then to describe what (if anything)
they did with the results. The second interview form encouraged informants
to discuss the major tasks and decisions their jobs routinely
entailed as a typical school year proceeded; it then inquired about all the
information that informed each task and decision. These interviews
made it possible to identify: (1) those tasks and decisions that teachers
and principals considered to be major responsibilities in their respective
jobs; and (2) those for which principals or teachers were inclined to
consult test scores or other assessment information. Thus, within space
constraints, the survey questionnaires were able to focus on major tasks
and decisions in which test results were likely to be used.

Below, the findings from the principals' and teachers' questionnaires
are described and discussed separately, then supplemented with information
from fieldwork interviews.

A Wide Variety of Assessment Results Play a Role in School-Level
Tasks, But Teachers' Tests and Their Professional Judgments Are
Most Important

Principals described the importance of different types of assessment
results in eight school-level tasks and decisions. Table 9 lists these and
shows the percentages of principals who stated that the different types
of assessment information were crucial or important in each task. Table
10 displays the same data in a different form: as the mean (or average)
importance rating principals gave each type of information for each
task.
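
The two presentations summarize the same ratings in different ways. The sketch below shows how both summaries could be derived from one set of responses. It rests on several assumptions that are not confirmed by the report: the ratings list is invented for illustration, a rating of 3 is taken to mean "important" (the table notes define only 4 as crucial and 1 as unimportant or not used), and the parenthesized figures are treated as sample standard deviations.

```python
from statistics import mean, stdev


def summarize(ratings, threshold=3):
    """Summarize 4-point importance ratings for one information source.

    Returns the percentage of respondents rating the source at or above
    `threshold` (assumed here to mean crucial or important), plus the
    mean rating and the sample standard deviation.
    """
    pct = 100 * sum(r >= threshold for r in ratings) / len(ratings)
    return {
        "pct_crucial_or_important": round(pct),
        "mean": round(mean(ratings), 2),
        "sd": round(stdev(ratings), 2),
    }


# Hypothetical ratings from ten principals for one source and one task.
hypothetical_ratings = [4, 4, 3, 3, 3, 2, 2, 1, 4, 3]
summary = summarize(hypothetical_ratings)
print(summary)
```

A source can therefore show a high percentage with a moderate mean, or the reverse, depending on how the ratings spread across the scale, which is why the two tables are best read together.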






Notice that both tables report the use of five main types of assessment 
results: those that come from ( 1 ) standardized, norm-referenced batter- 
ies; (2) minimum competency (proficiency) tests; (3) tests referenced to 
district curriculum objectives; (4) teachers' classroom tests and assign- 
ments (unit or chapter tests, quizzes, finals, whether teacher-constructed
or included with published curriculum materials); and (5) teachers'
observations of and interactions with students and/or their professional 
judgments. In fact, however, principals were also asked to rate the 
importance of other types of information for five of the eight tasks. 

Table 9. School-Level Uses of Test Results and Other Information
(Percentages of Principals Reporting Use of This Information
as Crucial or Important for the Specified Purpose)

                                Information Source
Task or Decision                A            B            C            D            E            F

ELEMENTARY
Curriculum Planning             78           [illegible]  65           72           88           --
Assigning Students to Classes   47           30           38           74           84           [illegible]a
Teacher Evaluation              16           11           25           40           --           [illegible]
Allocating Funds                [illegible]  21           29           --           81           77c
Student Promotion               [illegible]  36           48           84           96           94d
Informing the Public            72           38           41           42           --           --
Communicating to Parents        78           56           63           98           95           92e
Reporting to District           81           [illegible]  58           53           --           --

SECONDARY
Curriculum Planning             74           75           57           63           84           --
Assigning Students to Classes   72           64           45           75           80           76a
Teacher Evaluation              20           15           21           43           --           [illegible]b
Allocating Funds                24           28           21           --           94           84c
Student Promotion               24           48           26           84           76           96d
Informing the Public            74           63           43           47           --           --
Communicating to Parents        [illegible]  69           45           96           94           97e
Reporting to District           86           72           56           60           --           --

A = Results of standardized, norm-referenced batteries
B = Results of minimum competency (proficiency) tests
C = Results of tests referenced to district curriculum objectives
D = Results of teachers' classroom tests and assignments
E = Teachers' opinions, judgments, recommendations
F = Various other sources, as follows:
    a = students' past classroom behavior
    b = observations of teachers' teaching
    c = [illegible] directions from district
    d = classwork throughout the [illegible]
    e = students' report card grades
-- = Not asked




Table 10. Importance of Test Results and Other Information in
School-Level Tasks and Decisions
(Mean Ratings by Principals on a Four-Point Scale)*

Decision or Task                A            B            C            Avg. A-C     D            E

ELEMENTARY
Curriculum Planning             3.01         2.91         3.04         2.99         2.94         3.27
                                (.67)        (.75)        (.87)        (.07)        (.84)        (.64)
Assigning Students to Classes   2.30         2.35         2.46         2.44         2.93         3.12
                                (.81)        (.91)        (.99)        (.08)        (.79)        (.71)
Teacher Evaluation              1.70         1.53         1.80         1.68         2.12         --
                                (.76)        (.78)        (.93)        (.14)        (.97)
Allocating Funds                1.91         1.89         1.94         1.91         --           3.08
                                (.87)        (.90)        (1.01)       (.03)                     (.71)
Student Promotion               2.65         2.31         2.38         2.45         3.05         3.29
                                (.81)        (.96)        (.94)        (.18)        (.70)        (.67)
Informing the Public            2.77         2.47         [illegible]  2.52         2.31         --
                                (.90)        (.99)        (1.00)       (.22)        (1.05)
Communicating to Parents        2.91         2.64         2.67         2.74         [illegible]  3.45
                                (.60)        (.98)        (.95)        (.15)        (.55)        (.57)
Reporting to District           3.12         2.78         2.74         2.88         2.62         --
                                (.68)        (1.10)       (1.10)       (.21)        (.91)

SECONDARY
Curriculum Planning             2.83         3.27         2.95         3.02         2.76         3.14
                                (.67)        (.64)        (.82)        (.23)        (.75)        (.70)
Assigning Students to Classes   2.77         2.98         2.78         2.84         2.98         2.99
                                (.77)        (.87)        (.87)        (.12)        (.73)        (.79)
Teacher Evaluation              1.63         1.77         1.84         1.75         2.39         --
                                (.74)        (.71)        (.78)        (.11)        (.83)
Allocating Funds                1.73         2.20         2.06         2.00         --           3.34
                                (.81)        (1.13)       (1.08)       (.24)                     (.54)
Student Promotion               1.61         2.58         2.05         2.08         3.33         3.46
                                (.78)        (1.28)       (1.13)       (.49)        (.85)        (.75)
Informing the Public            2.84         2.92         2.30         2.69         2.24         --
                                (.80)        (1.03)       (1.07)       [illegible]  (1.05)
Communicating to Parents        2.91         [illegible]  2.55         2.83         3.56         3.38
                                [illegible]  (1.00)       (.99)        (.25)        (.55)        (.76)
Reporting to District           3.10         3.12         2.92         3.04         2.53         --
                                (.64)        (.97)        (.95)        (.11)        (.88)

A = Standardized, norm-referenced test batteries
B = Minimum competency tests
C = Tests referenced to district curriculum objectives
Avg. A-C = Average of A, B, and C
D = Results of teacher/curriculum tests
E = Teacher opinions, recommendations
-- = Not asked

*A 4-point scale: 4 = Crucial importance; 1 = Unimportant or not used.
Numbers in parentheses are standard deviations.







Table 9 (Column F) shows which of these other types of information 
most principals considered crucial or important for each of those five 
tasks, as well as the percentages who did so. For the sake of simplicity, 
these data are omitted in Table 10. 

As the tables indicate, most schools appear to ground their actions
upon several information sources in all eight tasks or decisions. In
general (Table 10), no one source stands out as markedly more important than
all the others for most tasks. For almost every task, however, principals
rate the results of teachers' classroom testing as crucial or important
more often than the results of any other type of paper-and-pencil
measure (see Table 9). What is more, for each of the eight tasks listed,
teachers' opinions, judgments, and recommendations clearly carry
more weight than any type of test results.

Some types of measures listed on the survey are more formal tests:
standardized, norm-referenced batteries, other kinds of minimum competency
measures,* and tests referenced to districts' instructional objectives.
Compared to teacher-made tests and class assignments, greater
attention is usually given to their psychometric quality, and their administration
is usually marked by more formal or "official" testing arrangements
and procedures. Usually, too, these tests are given in
schools at the mandate of an agency beyond the school, e.g., by the
district, the state, or even by the federal government as part of the
requirements for a specially funded program.

The results of these formal tests appear to make their greatest contri- 
bution in three school-level tasks: curriculum planning, communicating 
to parents about their children's achievement, and reporting to school 
district administrators. Conversely, formal test results are least important
in evaluating teachers and in allocating funds within the school for
such things as personnel, equipment, and materials. In secondary
schools, formal test results, and especially the results of minimum 
competency or proficiency tests, also play a significant role in decisions 
about students' class assignments. Fieldwork indicates, for example, 
that students who fail to meet minimum standards on competency tests 
are sometimes assigned to special courses designed for remediation in 
the basic skills covered by the tests. 

Standardized, norm-referenced batteries seem to be the most influential
of the formal required tests at the elementary level. However, at the
high school level, educators pay more attention to the results of minimum
competency tests than to those of the other types of formal
measures.



*In some states and districts, standardized, norm-referenced measures are used as
minimum competency or proficiency tests.






The Results of Formal Tests Are Deemed More Important in
Schools Serving Students of Lower Socioeconomic Status (SES)

An earlier section (page 21) noted that students in lower SES schools
do not spend more time taking tests than middle or upper-income pupils
do. Furthermore, teachers' classroom uses of test results (to be discussed
next) do not vary systematically or significantly with students'
socioeconomic status. In schoolwide or school-level tasks and decisions,
however, test results do appear to have greater impact and wider
consequences in lower SES schools than they do in higher SES settings.
In the former, principals report that more importance is accorded the
scores of formal tests (especially minimum competency measures
and district objectives-based tests) in planning curriculum, deciding
on students' class assignments, allocating school funds, and reporting
on school achievement to the public-at-large, parents, and district officials.
(See Table 11, which shows the results for all principals,
elementary and secondary together, divided into higher and lower SES
groups using school-level indicators.)

For Classroom Tasks, Teachers Place Most Weight on Their
Observations and the Results of Their Own Tests

Teachers were asked to rate the importance of the results of various
assessment types in four routine classroom tasks or decisions. The
proportions of elementary and high school teachers who described
different types of results as crucial or important in each are displayed in
Tables 12 and 13. Table 14 portrays similar data in a different form: as
the mean (or average) rating teachers gave each type of information for
each of the four tasks. Notice that Tables 12 and 13 divide teachers'
responses by subject matter, while Table 14 does not.

These tables demonstrate that teachers do use test results of various
types in making common instructional decisions. They also reveal quite
clearly, however, that teachers place greatest trust in their own observations
of students' class performance and in their personal, clinical
judgment. Nearly every teacher reporting says that their "own observations
and students' classroom work" are crucial or important sources of
information for initially grouping or placing students, in deciding to
change students' placement or grouping, and in determining students'
report-card grades. The great majority also give heavy weight to the
results of their own, self-constructed tests in each of these tasks. Among
teachers in the elementary grades, "the results of tests included with the






Table 11. Importance of Test Results for School Decisionmaking
in Schools of Higher and Lower Socioeconomic Status (SES)*

Decision or Task            A            B            C            Avg. (A,B,C)

HIGHER SES
Curriculum Evaluation       [illegible]  [illegible]  [illegible]  [illegible]
Student Class Assignment    2.49         2.24         [illegible]  2.27
                            (.71)        (.79)        (.96)
Teacher Evaluation          1.69         1.81         1.94         1.81
                            (.72)        (.74)        (.81)
Allocating Funds            1.85         1.85         1.71         1.80
                            (.83)        (.91)        (.86)
Student Promotion           [illegible]  2.49         2.27         2.31
                            (.83)        (1.04)       (.95)
Public Communication        2.69         2.36         2.33         2.46
                            (.78)        (.96)        (1.00)
Communicating to Parents    2.80         2.74         2.51         2.68
                            [illegible]  (.94)        (.84)
Reporting to District       3.03         2.94         2.74         2.90
                            (.73)        (1.09)       (.94)

LOWER SES
Curriculum Evaluation       [illegible]  3.18         3.08         [illegible]
                            [illegible]  (.59)        (.83)
Student Class Assignment    2.68         2.67         2.59         2.65
                            (.79)        (1.03)       (.94)
Teacher Evaluation          1.95         1.74         1.94         1.88
                            (.84)        (.72)        (1.03)
Allocating Funds            2.00         2.45         2.18         2.21
                            (.75)        (.92)        (1.00)
Student Promotion           2.45         2.39         2.17         2.34
                            (.93)        (.99)        (.84)
Public Communication        2.84         2.93         2.59         2.79
                            (.90)        (.97)        (1.04)
Communicating to Parents    2.96         3.26         3.26         3.16
                            (.57)        (.78)        (.51)
Reporting to District       3.11         3.28         3.11         3.17
                            (.65)        (.61)        (.93)

A = Standardized, norm-referenced test batteries
B = Minimum competency tests
C = District objectives-based or continuum tests
Avg. (A,B,C) = Average, required tests (A, B, C)

*(4-point scale: 4 = Crucial importance; 1 = Unimportant or not used)







Table 12. Classroom Uses of Test Results and Other Information
(Percentages of ELEMENTARY teachers surveyed reporting use of this information
as crucial or important for the specified purpose)

Tasks (each reported separately for Reading and Math):
  Planning Teaching at Beginning of School Year
  Initial Grouping or Placement of Students
  Changing a Student from One Group or Curriculum to Another
  Deciding on Students' Report Card Grades

Source/Kind of Information:
  Previous teachers' comments, reports, grades
  Students' standardized test scores
  Students' scores on district continuum or minimum competency tests
  My previous teaching experience
  Results of tests included with curriculum being used
  Results of other special placement tests
  Results of special tests developed or chosen by my school
  Results of tests I make up
  My own observations and students' classroom work



57 

.S7 
51 

X 



.S2 

M 
47 

X 



57 
50 

X 

7K 



96 



55 

S2 
45 

X 

67 
50 



97 



45 
\ 



56 

7K 
99 



5.H 
39 

X 

82 



85 
99 



17 
20 

X 

75 



92 
98 



16 
18 

X 

77 



95 
98 






Table 13. Classroom Uses of Test Results and Other Information
(Percentages of SECONDARY teachers surveyed reporting use of this information
as crucial or important for the specified purpose)

Tasks (each reported separately for English and Math):
  Planning Teaching at Beginning of School Year
  Initial Grouping or Placement of Students
  Changing a Student from One Group or Curriculum to Another
  Deciding on Students' Report Card Grades

Source/Kind of Information:
  Previous teachers' comments, reports, grades
  Students' standardized test scores
  Students' scores on district continuum or minimum competency tests
  My previous teaching experience
  Results of tests included with curriculum being used
  Results of other special placement tests
  Results of special tests developed or chosen by my school
  Results of tests I make up
  My own observations and students' classroom work



2H 

47 
4K 

X 



29 

29 
30 

97 

X 



34 

49 
47 

X 

4.S 
42 



X7 
99 



40 

30 
36 

X 

33 



77 
93 



62 
53 

X 

5S 



92 
99 



39 
3(1 

X 

43 



31 

91 

97 



12 
9 

X 

44 



99 
99 



5 

X 

3i 



34 

99 
95 






curriculum being used" play a major role in these same tasks. Notice,
too, that teachers at both levels of schooling count their own, previous
experience as teachers as their most important source of information for
planning teaching at the beginning of a school year or semester.

Mirroring findings for principals, these results show that teachers 
place less emphasis on formal test results than they do upon information 
they gather themselves. Nevertheless, teachers do rate formal test 
scores as somewhat important (Table 14) for initial planning and place- 
ment decisions, as well as in deciding later on to reassign individual 
pupils to a different group or curriculum. Fieldwork indicates that in the 
latter process, teachers frequently treat test results as a general indicator 
of the students' "capabilities." Teachers interviewed said that they
might examine standardized test scores, for example, to see if a poorly
performing student has "low ability" or "isn't working up to his ability
level." High school interviewees sometimes explained that they
checked the test scores printed on their class enrollment lists (as one put
it) "to be sure they really belong in this class."

The data in Tables 12, 13, and 14 hint that teachers rarely rely on only 
one type of assessment information as they go about making 
instructional decisions. Table 15 confirms that for many this is in fact 
the case. Not only do a good number of teachers routinely consult 
several types of assessment results in reaching each decision listed, 
they consider many as equally crucial or important. This tendency is
especially common among elementary teachers in the sample. 
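
To make the counting behind this kind of summary concrete, here is a minimal sketch of how the share of teachers who treat "many" sources as crucial or important could be tallied. Everything in it is an assumption for illustration: the responses are invented, the source names are placeholders, ratings of 3 or 4 are treated as "important" or "crucial", and "many" is set to four sources.

```python
# Hypothetical responses: each dict maps an information source to a rating
# on a 4-point scale (4 = crucial ... 1 = unimportant or not used).
responses = [
    {"standardized": 3, "district": 3, "curriculum": 4, "own_tests": 4, "observations": 4},
    {"standardized": 2, "district": 1, "curriculum": 3, "own_tests": 4, "observations": 4},
    {"standardized": 1, "district": 2, "curriculum": 2, "own_tests": 3, "observations": 4},
    {"standardized": 3, "district": 3, "curriculum": 3, "own_tests": 3, "observations": 3},
]


def percent_using_many(responses, many=4, threshold=3):
    """Percent of respondents rating at least `many` sources at or above `threshold`."""
    hits = sum(
        1 for r in responses
        if sum(rating >= threshold for rating in r.values()) >= many
    )
    return 100 * hits / len(responses)


print(percent_using_many(responses))
```

In this invented sample, the first and last teachers rate at least four sources as important or better, so half of the respondents count as using many sources.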

Table 16 elaborates on this last point and, in effect, summarizes the
key points of the discussion in this section. It demonstrates that except
in planning their teaching at the beginning of a school year or semester,
only small proportions of teachers count one source of assessment
information as more important than all others for the routine tasks
listed. And of those teachers who do report trusting one kind of information
above all the rest, from 86 to 100 percent say that the information
they trust most is their own observations and students' classwork
(or, in the case of planning at the start of the year, their previous
teaching experience).
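
The two quantities described here are nested: first the share of teachers with a single top-rated source, then, within that group only, the share whose top source is their own observations. The sketch below illustrates that two-step calculation on invented data; the responses, source names, and function name are hypothetical.

```python
def sole_top_source(response):
    """Return the single highest-rated source, or None if the top rating is tied."""
    top = max(response.values())
    leaders = [source for source, rating in response.items() if rating == top]
    return leaders[0] if len(leaders) == 1 else None


# Hypothetical ratings on a 4-point scale for four teachers.
responses = [
    {"standardized": 2, "own_tests": 3, "observations": 4},  # observations alone on top
    {"standardized": 3, "own_tests": 4, "observations": 4},  # tie, so no sole top source
    {"standardized": 4, "own_tests": 3, "observations": 3},  # standardized alone on top
    {"standardized": 1, "own_tests": 2, "observations": 3},  # observations alone on top
]

tops = [sole_top_source(r) for r in responses]
with_sole_top = [t for t in tops if t is not None]

# Step 1: percent of all teachers who rank one source above all others.
pct_with_sole_top = 100 * len(with_sole_top) / len(responses)

# Step 2: of those teachers only, percent whose top source is observations.
pct_observations = 100 * with_sole_top.count("observations") / len(with_sole_top)

print(pct_with_sole_top, round(pct_observations, 1))
```

The second percentage uses the smaller group as its base, which is why it can be very high even when the first percentage is low.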






Table 14. Importance of Test Results and Other
Information in Classroom Tasks and Decisions
(Mean Ratings by Teachers on a Four-Point Scale)*

                              Standardized  District      Tests         Teacher-      Teacher
                              Test          Continuum or  Included      Made          Observations/
                              Batteries     Minimum       with          Tests         Opinions
Decision Area                               Competency    Curriculum

ELEMENTARY
Planning teaching at the      [illegible]   2.60          --            --            3.39
beginning of the school year  (0.74)        (0.79)

Initial grouping or           [illegible]   2.59          2.91          3.12          3.58
placement of students         (0.74)        (0.82)        (0.74)        (0.83)        (0.78)

Changing a student from       [illegible]   2.52          3.04          3.12          3.66
one group or curriculum       [illegible]   (0.81)        (0.74)        (0.84)        (0.72)
to another, providing
remedial or accelerated
work

Deciding on report card       [illegible]   1.81          2.89          3.38          3.69
grades                        (0.76)        (0.81)        (0.79)        (0.74)        (0.72)

SECONDARY
Planning teaching at the      2.22          2.38          --            --            3.59
beginning of the school year  (0.84)        (0.93)                                    (0.60)

Initial grouping or           2.28          2.46          2.48          3.04          3.84
placement of students         (0.92)        (0.98)        (0.92)        (0.87)        (0.85)

Changing students from        2.52          2.59          2.67          3.27          3.61
one group or curriculum       (0.95)        (0.86)        (0.93)        (0.76)        (0.66)
to another, providing
remedial or accelerated
work

Deciding on report card       1.36          1.45          2.29          3.65          3.68
grades                        (0.66)        (0.64)        (0.96)        (0.62)        (0.65)

-- = no entry

*(4-point scale: 4 = Crucial importance; 1 = Unimportant or not used)






Table 15. Proportion of Teachers Who Report
Considering Many Types of Assessment Information
Critical/Important for Given Tasks

                                      Planning       Initial        Changing       Deciding
                                      Teaching at    Grouping       Grouping       on Report
                                      Beginning of   or Placement   or             Card
                                      School Year    of Students    Placement      Grades

Number of Sources of
Information Given in
Question on Survey                    4              7              [illegible]    [illegible]

Number of Sources
Defined as "Many" for
Purposes of this Analysis             [illegible]    4              4              4

Proportion of Elementary
Teachers Who Indicated That
at Least This Many Functioned
as Critical and/or Important
for the Given Activity                [illegible]    71%            [illegible]    [illegible]

Proportion of High School
Teachers                              [illegible]    [illegible]    [illegible]    [illegible]



Table 16. Percentages of Teachers Who Consider One Type of
Assessment Information To Be More Important Than Any Other







ELEMENTARY




SECONDARY 


Task or Decision 


of 


% choosing 


of 


% choosing 




Total 


teacher 


Total 


teacher 






observation/Judgment* 




observation/Judgment* 






as most Important 




as most Important 


Planning teaching
at the beginning of
the school year


4S 






^7 


Initial grouping or
placement of students


2S 


ss 






Changing a student
from one group or
curriculum to
another


27 


ss 






Deciding on students'
report card grades


21 




It) 


HH) 



* Percentages in these columns are the percentages of those teachers who did select one type of assessment as
more important than all the others, rather than percentages of all teachers in sample.






Fieldwork Interviews Support and Elaborate Survey Findings.

In the on-site interviews, teachers were able to describe with minimal
constraints how they used test results and information from other as-
sessment techniques. The purposes they most frequently cited were
those that constitute their most essential, routine work: deciding what
to teach and how to teach it to students of different achievement levels;
keeping track of how students are progressing and how they (the teach-
ers) can appropriately adjust their teaching; and evaluating and grading
students on their performance (see Table 17). Clearly, these are the day-
to-day routines of teaching.

Less frequently, respondents mentioned using assessment results in 
deciding to refer students who need special instruction and to counsel, 
advise, and direct students. These are important teaching responsibil- 
ities, but ones that serve to support or facilitate more basic instructional 
work. 

Interviews also show that, unconstrained by the response format of
the questionnaire, teachers still indicate that, of all the types of paper-and-
pencil measures they have available for assessing students' achieve-
ment, they rely most often on those that they themselves develop. As
Table 17 shows, teachers freely cited more uses for such assessment
tools than for any of the other types. The teachers interviewed univer-
sally reported that their own perceptions of children's performance in
class, on homework, etc. were an important factor in all their judgments
and decisions; thus the frequency with which these were mentioned is
not included in Table 17. Fieldwork findings, then, are completely
consonant with survey results despite differences in the elicitation
procedures.

Fieldwork interviews also help to explain some of the reasons why
teachers feel that the results of one type of test, or even of tests in 
general, cannot be trusted without reference to their everyday experi- 
ence with learners. The following quotations are illustrative. 

• I don't rely heavily on a lot of the test scores because I find that
. . . some students are test takers and others are not . . . some
students can handle the format, the time limit, (but in many cases)
students are capable of more than the test scores show.

• I hate to say it, but I'd say about a third of these students don't give
it their best shot. They feel there's nothing in it for them. There's
no grade for it; there's no use for it — so they don't care.

• If I see there are certain kids having trouble I may look at their
folders and find out (more) about them. But I try not to be swayed
by somebody else's judgment ... I may get more out of them by






Table 17. Types of Tests and the Uses of 
Their Results by Teachers (Interview Data) 

(Cells show the number of times the 44 interviewed 
teachers freely cited each use for each type of test)

I si:s I Ks I rvPKs 



Planning Instruction


A 
24 


B 
21 


in 


I) 
\ 


1; 
-> 


1- 


(} 

\} 


11 
4 


1 


Total 
«2 


Referral/Placement


\ 


(i 






n 




It 


1 


0 




Within Classroom Group-
ing & Individual
Placement


ti 


14 




4 


(1 


s 


4 




I 


01 


Holding Students
Accountable for Work,
Discipline


s 






(1 


I) 


0 


i) 


f) 


0 


n 




\2 




17 


s 


1 


1 


1 


1 


0 




Monitoring Students'
Progress


IX 




17 




1 


1 


0 


1 


0 


Si 




U) 




0 


1) 


I 




\ 


0 


{) 




Informing Parents


1) 


1 


0 


(J 


n 


1 


0 


i\ 


(} 




Boaul. oic 


0 


\ 


1 


0 


0 




(} 


0 


0 




Comparing Groups of
Students, Schools, etc.


0 


1 


0 


{) 


0 


1 


1 


i) 


0 




CVmiN in^ Mmmuni) 


1) 


n 


0 


{\ 


0 


0 




1 




1 


TOTAL TEST CITATIONS


int 


74 


(^} 




1 1 




u 


in 


\ 




Explicit Statements


0 


1 


0 


1 


0 


(1 


ID 




•7 


21 



of Non-Use



KEY:












) 




h 


IvMvlu i'' < Mlu'i M M^'f \NNiytu»u-nis 


i > 


Standardized




Curriculum Embedded


H 


Minimum Competency


n 


School, Department, Grade Level


1 




I 


Commercial Diagnostic










what I'm telling them and trying to motivate them to do better than
they've ever done before.

• You can't count on a score on one test too heavily. The kid could be
sick or tired or just not feeling up to doing it that day. Maybe his
parents had a fight the night before. Maybe he doesn't test well.

It seems, then, that part of what teachers "know" is that students can
vary as test takers and that a variety of situational factors can influence
students' test performance. Under these circumstances, teachers appear
to reason, it is better to rely upon a variety of information sources —
and especially on one's day-to-day experience with students in the
variety of task and performance contexts that routinely recur in the
classroom. If principals share this outlook, it may explain why they,
too, routinely count on teachers' judgments, opinions, and recommen-
dations (Tables 9 and 10 above).



CHAPTER 4 



ADMINISTRATIVE LEADERSHIP: 
MONITORING AND SUPPORTING 
ASSESSMENT 



A growing research literature demonstrates the importance of district
and school leadership in the implementation and maintenance of par-
ticular education innovations, programs, and practices (e.g., Berman
& McLaughlin, 1977; Bank & Williams, 1981b; Edmonds, 1979). In
view of these findings, the Test Use in Schools Study sought to describe
how, and how regularly, district and school administrators play leader-
ship roles in local achievement assessment.

Exploratory fieldwork suggested that administrators' assessment-
related activities tend to fall into four general categories and to include
both monitoring and supporting functions. The four categories are:

(1) monitoring testing: checking to see that appropriate assessment
practices are followed.

(2) linking test results with instruction: reviewing test scores, examin-
ing their implications for instruction, communicating these to school
staff, and monitoring instruction to assure that it attends to the areas
that scores suggest should be emphasized.

(3) providing staff development: supporting assessment and test use
by initiating in-service training and informational sessions.

(4) facilitating routine classroom assessment: initiating and main-
taining technological and organizational arrangements that reduce
teachers' time on testing.

Fieldwork also indicated the range of ways in which district and school
administrators commonly carry out each of these leadership roles. In
addition, it confirmed that principals usually have much more reliable
knowledge about their districts' policies and practices than classroom
teachers do.

CSE's national survey took these findings into account. Question-
naires examined the four types of activities listed above; specific ques-
tions and response choices were generally derived from the fieldwork.
Questions about the role of district administrators were directed to
principals, rather than teachers. Both principals and teachers were
asked to report on certain school-level leadership activities.




The results of this inquiry are described and discussed below.

District Testing Programs Are Closely Monitored; Routine
Classroom Assessment Is Not.

As Table 18 shows, most principals say that their district administra-
tors closely monitor districtwide testing programs to be sure they are
properly carried out. While fewer than half at both levels of schooling
find that such oversight is regular or routine, many others note that it
occurs "fairly often." Only 25% of the elementary principals respond-
ing and 32% of the secondary principals report that their districts
rarely or never check up on district testing.

In sharp contrast, there appears to be very little monitoring of routine
classroom assessment. Administrators in most schools do not system-
atically review and critique the tests that their teachers construct. This
practice is regular or frequent in only KV7f of the elementary principals'
schools and in MY/i of the secondary principals'. (Administrative re-
view of high-school final examinations, fieldwork suggests, may ac-
count for the difference in these percentages.)

Table 18. Monitoring Achievement Testing
(Percentages of Principals Reporting the
Regularity of Each Activity)*

Elementary Secondary
Routinely Often Rarely Never Routinely Often Rarely Never

all aspects of the district

testing program are properly



4 1 



Ml 



IlliOpK""^ >'l ihi" Ic^l^ tlu v 
(M i riiiqiK'il 

Requires that teachers turn
in the scores or grades of
the tests and assignments
they routinely give in their
classrooms (e.g., unit tests,
chapter tests, etc.)



* Routinely = regularly or routinely, on a systematic, periodic basis as part of routine functioning; Often = not regular or
routine but happens fairly often; Rarely = not regular or routine and happens rarely; Never = this doesn't happen at all.






Monitoring of teachers' test results, it appears, is only slightly more
common than the practice of reviewing their tests. A mere third of the
principals at each level of schooling make it a routine or frequent
requirement for teachers to turn in students' scores or grades on class-
room tests and assignments. When they do so, furthermore, it may not
be for oversight purposes. Fieldwork found one elementary school
principal who did examine all the reading and math unit-test scores of
each of his thirteen teachers' pupils in order to "keep track of how
things are going and identify problems that should be discussed."
Elsewhere, however, principals gathered students' scores on
commercial, curriculum-embedded tests on a pro forma basis and never
examined them. They were used only to complete forms in compliance
with evaluation requirements for a special program. In addition, several
high school administrators mentioned collecting students' grades on
final exams "in case there are any complaints from parents about the
course grades" or "in order to protect the teachers."

In summary, the results in Table 18 indicate that most school admin- 
istrators do not check up very often on teachers' test designs, scoring 
procedures, or grade distributions. Rather, they appear to trust their 
teachers' professional competence in assessing student achievement. 
The next chapter offers further evidence to support this proposition. 
While few review teachers' assessment procedures often, over 80% of 
the principals studied express confidence that teachers construct tests of 
high quality (Table 25, page 60). All this is especially worthy of note 
given the importance generally accorded the results of teacher-made 
tests and assignments in a wide variety of school and classroom tasks 
(Tables 9 through 14 in Chapter 3).

Testing And Instruction Are Not Well Linked In Many Districts 
and Schools. 

Evidence in the previous chapter (Tables 11 and 13) indicates that 
both principals and teachers tend to rely heavily on the results of many 
different types of tests as they go about planning curriculum and in- 
struction. Nevertheless, it appears that a good many district and school 
leaders are doing less than they could to facilitate the use of test results 
in the planning and teaching process. 

Tables 19 and 20 below list several very basic activities that district 
and school leaders can undertake toward linking test results with cur- 
riculum and instruction. As a first step (Table 19), districts can arrange 
testing and test scoring such that results are returned to schools at a time
and in a format which permit them to be useful and used. Then, once 






the scores arrive in a school (Table 20), administrators there can initiate
meetings with teachers to examine their implications: to identify and
highlight the subjects and skills that seem to require greater (or less)
teaching emphasis. If principals' perceptions are correct, however,
these are consistent, routine procedures in only a minority of settings.

Over half (54%) of the high-school principals and nearly as many 
elementary-school administrators (47%) say that their districts rarely or 
never return test results in ways that make them useful for curriculum 
planning. Those who find that their districts do so regularly and system- 
atically comprise only small proportions of the sample: 30% at the 
elementary level and 18% at the secondary level.

Most principals claim that they do better in reviewing and analyzing 
the test results with their teachers. Some 84% of those in elementary 
schools respond that they meet with teachers regularly or at least fairly 
often to discuss what test scores mean for instruction. Among the high 
school principals, 76% reply in the same way. But if their reports of
district procedures for returning results are correct, many may be dis-
cussing scores that are outdated or otherwise inappropriate.
Alternatively, principals may be using different standards to judge what
is "routine" and "often" in describing their own behavior and their
Table 19. Linking Test Results with Instruction: 
District Leadership 
(Percentages of Principals Reporting the Regularity 
of Each Activity)* 



Elementary Secondary
Routinely Often Rarely Never Routinely Often Rarely Never



DISTRICT ADMINISTRATION
returns test scores in such
a way that I can use them to
decide on the skills and
content we need to work on
in our school


















observes my work, reviews
school plans, and/or re-
quires written reports to
be sure the school is
emphasizing the skills
or content areas that
test scores show need
emphasis in our school




u 




1 i 


2h 






7 


establishes specific test
score goals for our school
to meet








4fi 


\') 




to 





* See footnote to Table 18 for a detailed description of these response choices






Table 20. Linking Test Results with Instruction: 
School Leadership 
(Percentages of Principals and Teachers Reporting 
the Regularity of Each Activity)*

Elementary Secondary
Routinely Often Rarely Never Routinely Often Rarely Never

DISTRICT ADMINISTRATION . . .
meets with individual teach-
ers, departments, and/or grade
levels to review test scores
in order to identify skills
or content areas that need
extra emphasis/less atten-
tion



observes teachers, reviews
their plans, and/or re-
quires written reports to
be sure they are giving
emphasis to the skills,
content, etc. that test
scores show their students
need to work on



considers students' test
scores in evaluating teach-
ers and/or establishes test
score goals for teachers
to meet



4« 



17 



(i:i 



:5 

(14) 



tiy) 



4 



(241 



17 
t.M) 



|7()i 



41) 



40 

1 14 



in 

I 5) 



I? 

i.Mj 



1 10} 



(3li 



50 

i7:i 



* Teachers' responses are shown below principals', in parentheses.
See footnote to Table 18 for a detailed description of these response choices.



districts'. Another possibility is that some principals, viewing the use
of test data in instructional planning as a desirable practice, have
exaggerated the frequency with which it occurs in their schools.

Teachers' observations (Table 20) support this last hypothesis. In
general, they assert that meetings to link test information with
instructional plans take place less regularly than principals maintain
that they do. Assuming the salience of such meetings for teachers is the
more important (since it is they, after all, who must put any
instructional plans into effect), it appears that test-based planning
occurs on a regular, periodic basis in about 37% of the elementary
teachers' schools and 14% of the high-school teachers'. In another
22% of the former and 19% of the latter, it seems to occur fairly often.
(Refer to the figures in parentheses in the first line of Table 20.) While
these percentages are not insubstantial, they do suggest that many
school leaders could be deriving greater value from their test scores
than they currently are. In addition, many leaders at the district level





could be doing more to facilitate this process by getting scores into 
principals' and teachers' hands in a timely and useful fashion. 
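
The distance between principals' and teachers' accounts can be made concrete with a little arithmetic. The sketch below adds the "routinely" and "often" shares that teachers report for test-review meetings and sets the result beside the share of principals who say they hold such meetings regularly or fairly often. The percentages are those quoted in the text; the variable and function names are illustrative only, and the comparison assumes that the principal and teacher samples can reasonably be set side by side in this way.

```python
# Percentages quoted in the text for meetings that link test scores to instruction.
# Names below are illustrative; they do not come from the survey instruments.
PRINCIPALS_REGULAR_OR_OFTEN = {"elementary": 84, "secondary": 76}
TEACHERS = {
    "elementary": {"routinely": 37, "often": 22},
    "secondary": {"routinely": 14, "often": 19},
}


def reporting_gap(level):
    """Return (teachers' combined share, principal-minus-teacher gap) in percentage points."""
    teacher_share = TEACHERS[level]["routinely"] + TEACHERS[level]["often"]
    gap = PRINCIPALS_REGULAR_OR_OFTEN[level] - teacher_share
    return teacher_share, gap


for level in TEACHERS:
    share, gap = reporting_gap(level)
    print(f"{level}: teachers {share}%, principals "
          f"{PRINCIPALS_REGULAR_OR_OFTEN[level]}%, gap {gap} points")
```

On these figures, about 59% of elementary teachers and 33% of high-school teachers describe such meetings as routine or fairly frequent, against 84% and 76% of principals, respectively.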

Following through to be sure that test-based curricular and teaching 
plans are implemented is a next, fundamental step in linking testing 
with instruction. Thus, district administrators can visit schools, review 
their plans, and/or require written reports to be sure schools are empha- 
sizing the skills or content areas that test scores show are in need of 
extra attention (Table 19). School administrators can take similar steps 
with classroom teachers (Table 20). Somewhat ironically, it appears 
that both district and school leaders pursue these monitoring procedures 
more regularly than they make test results and their implications acces-
sible and clear to teachers. (Compare the first and second lines of Table
19 and Table 20. Once again, note the differences in principals' and
teachers' reports in Table 20.)

As yet another step in holding their staff members accountable for
test-based curricular and instructional plans, administrators can estab-
lish specific test-score goals for schools and teachers to meet. They can
also take students' test results into account in teacher evaluation. Table
20 reveals, however, that these steps are rarely taken at the school level.
Only 12% of the elementary-school principals and 11% of those in
secondary schools say that they regularly or frequently set test-score
goals for their teachers to meet or consider test results in teacher
evaluation. As the next chapter demonstrates, principals simply do not
deem it appropriate to assess teachers' competence on the basis of their
students' test performance. Most rely on their own observations of
teachers' work in the classroom for this purpose (Table 25, page 60).

Administrators at the district level, on the other hand, are more
likely to set test-score benchmarks for schools. Overall, 36% of the
principals in elementary schools and 33% of those in high schools
report that their districts do so routinely or often (Table 19). This
practice, survey results also suggest, occurs more commonly in dis-
tricts serving lower socioeconomic groups than in those serving the
well-to-do. Only 10% of the elementary and secondary principals in the
highest socioeconomic districts sampled say that they routinely face
district-established test-score goals. Among those in the lowest socio-
economic districts sampled, however, the figure is 40%.

Reviewing all the "routinely" and "often" columns in Tables 19
and 20, it is evident that roughly a half to two-thirds of the principals'
districts and schools manifest some concern that test scores be used in
curricular planning and instruction. Nevertheless, it is also apparent
that comparatively few administrators routinely take steps to be sure
that test scores are readily accessible or routinely review those test






scores with their faculty members. More, but still relatively small 
percentages of administrators, routinely check to see that test-score- 
based curricular and instructional decisions are actually carried out in 
classrooms. Even fewer choose to hold schools and teachers account- 
able for such decisions by projecting test-score objectives for them to 
achieve. Considering test results in evaluating teachers, moreover, is 
generally avoided. All of this — plus certain apparent inconsistencies 
in principals' reports and the divergence of teachers' and principals' — 
suggests that in most districts and schools the links between testing and 
instruction are very loose indeed, especially at the secondary level. 
Fieldwork during the Test Use in Schools Study supports this finding, as 
does on-site research conducted in other CSE projects (e.g., Bank &
Williams, 1981a).

Teachers Average Seven to Eight Hours a Year In Assessment 
Inservice; Explanations of How To Administer Tests and of Test 
Results Are the Most Common Topics. 

Studies have repeatedly revealed that teachers receive little
preservice training in testing and measurement (e.g., Coffman, 1983;
Yeh, 1978). This is one reason why their inservice activities in assess-
ment are of special interest. What is more, it appears that staff develop-
ment is a critical factor in districts' establishment of
testing-evaluation-instruction linkage systems (Bank & Williams,
1981a). Districts' and schools' staff development and informational
activities in the area of student achievement assessment,
therefore, were given considerable attention in the CSE national survey.

Principals' responses show that district-sponsored staff development 
in assessment occurs routinely or often in b\9c of their elementary 
schools and 57% of their high schools. School-supported inservice 
takes place, they collectively report, only slightly less regularly (Table 
21). Allowing teachers extra pay or time away from the classroom to 
help develop tests and related materials appears to be a somewhat less 
widespread practice. Some 41% of the elementary and secondary prin-
cipals say that it happens routinely or frequently in their districts.

These figures suggest that most districts and schools give consider-
able attention to training teachers in assessment and, to a lesser degree,
utilize teachers' skills in local test development. Once again, however,
teachers' reports present a more modest picture. The elementary teach-
ers surveyed estimate that they had spent, on the average, only six hours
in district or school-supported inservice training on student assessment
during "the last two years." Secondary teachers judge that they had





spent an average of only five hours thus engaged in the same period.
During those two years, meetings to select tests, to construct them, or
to help formulate testing policies consumed another eight hours for
elementary teachers and an additional eleven for high-school instruc-
tors. (See Table 22.) All told, then, it appears that teachers average
about seven or eight hours a year on all district- and school-sponsored
inservice activities connected with assessment. Of this total, teachers
spend about two-and-a-half or three hours expanding their assessment
skills.
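
The annual figures follow directly from the two-year totals that teachers reported. As a minimal sketch of that arithmetic, the snippet below halves the two-year hours given in Table 22; the dictionary keys and the function name are illustrative labels rather than terms from the questionnaire, and the calculation assumes the hours were spread evenly over the two years.

```python
# Average hours per teacher over "the last two years" (Table 22).
# Keys are illustrative labels, not wording from the survey.
HOURS_OVER_TWO_YEARS = {
    "elementary": {"meetings": 8, "inservice_training": 6},
    "secondary": {"meetings": 11, "inservice_training": 5},
}
YEARS = 2


def annual_hours(level):
    """Return (hours/year on all assessment activities, hours/year on skills training)."""
    totals = HOURS_OVER_TWO_YEARS[level]
    all_activities = (totals["meetings"] + totals["inservice_training"]) / YEARS
    training_only = totals["inservice_training"] / YEARS
    return all_activities, training_only


for level in HOURS_OVER_TWO_YEARS:
    overall, training = annual_hours(level)
    print(f"{level}: {overall:.1f} hours/year overall, "
          f"{training:.1f} hours/year on skills training")
```

Halving the totals gives seven hours a year for elementary teachers and eight for secondary teachers, of which three and two-and-a-half hours, respectively, are inservice training proper.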

These estimates should be taken as extremely rough, based as they 
are on teachers' recollections over two years. They do, however, put 
principals' estimates of district and school support in perspective. If 
local educational agencies are devoting a great deal of time to develop-
ing or employing teachers' assessment skills, that time is not particular-
ly salient for most teachers.

Table 21. Supporting Assessment Through Staff
Development and Release Time
(Percentages of Principals Reporting the
Regularity of Each Activity)*



Elementary Secondary

Routinely Often Rarely Never Routinely Often Rarely Never



DISTRICT ADMINISTRATION
provides speakers, workshops,
printed material, etc. in an
effort to help teachers expand
and update their skills and
understanding in the area of
student assessment






> 




1 1 


\^ 


\2 


11 


provides released time and/
or extra pay for teachers
to help develop tests and/or
LiicK uluin inaU'riaiN ifnu 
iifi hull hs{s) 








u 


i: 








THE SCHOOL ADMINISTRATION
brings in speakers, workshops,
printed material, etc. to help
teachers update and further
develop their skills and under-
standing in the area of


> > 






io 


') 




4^ 


•> 



* See footnote to Table 18 for a detailed description of these response choices






Table 22. Teachers' District and School
In-Service Time on Assessment
(Reported in Average Number of Hours
Spent Over the Last Two Years)*

                                           Elementary   Secondary
                                           Teachers     Teachers

Meetings within my district or school
to select or construct tests and/or
to help establish testing policy               8           11

District or school supported inservice
training on topics related to student
assessment (testing, other techniques)         6            5

* The figures given here are rounded to the nearest hour. They are based on teachers' responses to the following
direction: "For each activity below in which you have participated, indicate the approximate TOTAL number of
HOURS you spent in the last two years."



Table 23 elaborates on these findings, showing how teachers spend
their staff development time. For the most part, they attend explana-
tions of state, district, or school test results and receive directions on how
to administer required tests. Inservice training that would help teachers
develop or expand classroom assessment skills, the table shows, tends
to occur far less frequently. Thus, for instance, only about a fifth of the
teachers in each category report receiving instruction in "how to con-
struct or select good tests." Information on alternatives to testing is
provided just as rarely for secondary teachers, although some 54% of
the elementary teachers do report staff development on this topic.
Training in the use of test results to improve instruction is evidently
provided for 35% of the elementary teachers and about 20% of the
secondary teachers sampled.

Two other staff development activities on the list can be construed as
aimed directly at improving students' test results: "How to tie what is
taught more closely to the skills, content covered on required tests" and
"Presentation of published materials designed to prepare students for
particular tests or to improve test-taking skills." From a quarter to a
third of the secondary teachers and 40% to 50% of elementary teachers
have received training in these areas.

In summary, it appears that districts and schools are doing much less 
than they could to build teachers' competencies in achievement assess- 
ment. This is especially true for high-school teachers. 






Table 23. Teachers' Participation in Staff Development
(Percentages of Teachers Who Report Joining in
At Least One Session on Various Topics
During "the last two years")

                                                          Secondary   Secondary
Topic                                        Elementary   English     Math

(1) Analysis and explanation of state,
    district, or school test results             X4          70          dO

(2) How to administer tests required by
    my state, district, and/or school
    (procedures to follow, etc.)                 7H          54

(3) How to interpret and use results of
    different types of tests (e.g., norm-
    referenced and criterion-referenced
    tests and their applications)                            ^s          34

(4) Alternative ways (other than tests)
    to assess student achievement                54          25          21

(5) How to tie what is taught more closely
    to the skills, content covered on
    required tests                               50          37          25

(6) Presentation of published materials
    designed to prepare students for
    particular tests or to improve
    test-taking skills                           41          32          2^>

(7) Training in the use of test results
    to improve instruction                       35          21          1^

(8) How to construct or select
    good tests                                   20          23          IS



Resources To Facilitate Routine Classroom Assessment Are Not
Widely Available; But Where They Are Available, They Are Used.

Survey and fieldwork results discussed in Chapter 2 demonstrate that
teachers spend considerable time constructing, grading, and recording
the results of their own tests and assignments. Administrators can help
teachers reduce this time by initiating and supporting technological and
organizational arrangements that facilitate their testing work. Among
those that fieldwork found to be available were banks of test items,
computerized test scoring and analysis and, of course, paid
paraprofessionals or volunteers to assist teachers in reading and grading
tests and assignments. In addition, fieldwork suggested that some prin-
cipals provide special time and support for teachers to develop tests that
they can use in common with classes in the same grade level, subject,
etc.






While fieldwork and questionnaire piloting indicated that this was a 
reasonable list of resources to investigate in the national survey, survey 
reports show that three of the four are unavailable to large proportions 
of survey respondents (See Table 24). The exception, of course, is 
"other teachers with whom I plan and develop tests or other evaluation
assignments," but only about a quarter of the elementary-school teach-
ers and a similar fraction of the secondary-school teachers report taking
advantage of this resource at least monthly. Some 45% of the secondary 
teachers say that they construct tests with others a few times a year, and 
fieldwork suggests that this often occurs as teachers in the same depart- 
ment conjointly devise mid-term and final exams. 



Table 24. Available Resources for Testing
(Percentages of Teachers Reporting)

                                                          AVAILABLE
                                                         Used Once    Used at Least
                                  NOT                    To Several   Once/Month
Resource                          AVAILABLE   Not Used   Times/Year

Item banks of test questions         71          4           8            16   Elementary
upon which I draw in
making up my tests.                  .SI         X          24            16   Secondary

Other teachers with whom I plan      M          12          26            24   Elementary
and develop tests or other
evaluation assignments.              21         10          45            24   Secondary

Someone who helps me read,           69                      4            21   Elementary
grade, or correct
tests and assignments                70          5           4            21   Secondary

Quick, computerized                  M           2          M)             4   Elementary
scoring and analysis
of tests.                            5X                                    4   Secondary



Computerized test scoring and analysis is used a few times annually
by a quarter to a third of both the elementary and secondary teachers
sampled. Fieldwork indicates that this probably reflects the use of
small, on-site optical scanning machines for scoring multiple-choice
and similar "objective" tests. The number of districts and schools with
more sophisticated equipment that analyzes students' errors is still quite
small. Some districts, however, have developed computer programs for
scoring unit and chapter tests and simultaneously analyzing individual
students' strengths and weaknesses on the skills they cover.

A final point: in general, nearly all those teachers who have access to 
the resources listed indicate that they use them at least sometime during 
the school year. 
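
The final point can be checked against Table 24 with simple arithmetic. What follows is a minimal illustrative sketch in Python, not part of the original study: the function name `share_using` is hypothetical, and only rows whose percentages are fully legible in Table 24 are included. For each row it computes, among teachers who report that a resource is available, the proportion who use it at least once during the year.

```python
# Percentages taken from Table 24: (not used, used once to several
# times/year, used at least once/month). "Not available" is excluded
# because the question is about teachers who DO have access.
rows = {
    "Item banks (elementary)": (4, 8, 16),
    "Other teachers (secondary)": (10, 45, 24),
    "Reader/grader help (secondary)": (5, 4, 21),
}


def share_using(not_used, sometimes, monthly):
    """Share of teachers with access who use the resource at all."""
    available = not_used + sometimes + monthly
    return (sometimes + monthly) / available


for label, cells in rows.items():
    print(f"{label}: {share_using(*cells):.1%} of those with access use it")
```

For the three rows shown, the share falls between roughly 83% and 87%, which is consistent with the statement that most teachers who have access to a resource make at least some use of it.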






CHAPTER 5 



PRINCIPALS' AND TEACHERS'
PERCEPTIONS AND BELIEFS
ABOUT TESTING



Previous chapters have focused on what teachers and principals re- 
port that they do in assessing students' achievement, in using assess- 
ment results, and in monitoring and supporting assessment. Here, 
attention shifts from what teachers and principals do in assessment to 
what they perceive, believe, and value as they do it. 

Three complementary objectives shaped CSE's exploration of principals'
and teachers' viewpoints on testing. One was to elaborate and
clarify, confirm or disconfirm the values and beliefs suggested by
principals' and teachers' assessment practices. A second objective was
to gather their perceptions of current testing trends and policies and of
how these are affecting the schools. In the widespread debate over 
testing and its uses, administrators and teachers in the schools have had 
little direct voice. Here was an opportunity to solicit their views. A 
third objective was to examine relationships between assessment atti- 
tudes and activities: to learn whether certain sets of beliefs seem to co- 
occur with and "explain" certain practices or, on the other hand, 
whether particular practices (in staff development, for example) seem 
to coincide with and account for particular sets of beliefs. Such rela- 
tionships could point the way toward policy and action in local school 
districts and schools. 

Toward these ends, the survey questionnaires presented principals 
and teachers with sixteen statements and asked them to indicate strong 
agreement or agreement, disagreement or strong disagreement with 
each. The statements for principals and those for teachers varied slight- 
ly in phrasing, taking into account differences in their respective roles. 
Nevertheless, both forms of the questionnaire covered identical topics: 
(1) the quality of achievement tests; (2) their value or usefulness; (3)
effects of testing on the school; (4) the fairness and desirability of
minimum competency (proficiency) testing; (5) educators' accountability
for students' test results; and (6) the importance of testing as a local
educational issue. 






Respondents' perceptions and beliefs regarding the first four issues 
evolved as especially relevant in later analyses. They are emphasized in 
the discussion below; their relationships with other study findings are 
described in the next chapter. Viewpoints on issues (5) and (6) are 
mentioned briefly in this one. As in previous sections, information
from fieldwork interviews serves to supplement and elaborate the sur- 
vey results. 

Principals: A Pro-Testing Perspective 

Testing appears to be a central issue in the professional lives of most 
of the principals studied. Nearly two thirds report that it receives "a
good deal" of discussion in their districts. What is more, a substantial
majority seem to approach their discussions with a highly favorable
view of tests and testing. (Refer to Table 25.)

Principals judge that the quality of tests is generally high. Eighty 
percent or more of those who responded apply this judgment to tests 
that accompany published curriculum materials, to tests developed by 
their districts, and to the tests constructed by the teachers in their 
schools. A similar proportion (82%) concludes that standardized tests 
are fair for most students. 

Unfortunately neither the survey nor project fieldwork was able to 
explore exactly how principals arrive at these judgments. Principals' 
broad confidence in test quality, however, is worthy of note in itself. It
can help to explain their regular use of test results in a variety of routine
tasks (Tables 9 and 10, pages 31 and 32), as well as their general belief
in testing's validity and value (discussed next). Later, as the policy
implications of this study are examined, principals' confidence in test 
quality will be cited again. 

Most principals see testing as valid and valuable. Principals, we 
have seen, rely on test scores most heavily for planning curriculum and 
(especially) for reporting school achievement to district officials, par- 
ents, and the general public. These uses can follow from district 
directives, public expectations, and other forces beyond principals' 
control. Be that as it may, most principals seem comfortable using test 
results in those ways. On the whole, they believe test scores accurately 
reflect their schools' performance, and they generally see testing as an
asset. 

By an overwhelming majority, principals reject the view that schools 
should not be held accountable for their students' test results. (See
Table 25, "Accountability.") They appear to accept that what goes on in
school, and not, for instance, students' native abilities, their parents'
support, or the community environment, is primarily responsible for how
students do on tests.

In a consistent set of responses, two thirds of the elementary-school 
principals and three quarters of those in high schools find that test scores
provide "a fairly good index of how a school is doing." As one
California high-school principal explained in an interview: 

I'm not a believer that test scores tell all. Many factors contribute to 
outcomes and they're not all revealed in test scores. But they are important
indices. They're something we should take a look at among other
data . . . Like with our [standardized test and state assessment] results, I
keep a running tally of the means and of where we are, so that I'm aware
of the progress and of where our students may have had some difficulty.
And we share that with the math and English departments, particularly,
and with the rest of the staff.

At an Iowa high school, the principal volunteered a similar perspec- 
tive: 

I don't know that test results per se would change specific instruction 
much, but if year after year after year we had a department rating low, we
would certainly look at several things. We'd want to talk to the people [in
that department] to see what the problem is.

These remarks reflect a qualified, or cautious, acceptance of test 
scores as "indices" of school performance. Fieldwork suggests that
such a stance is common among both elementary- and high-school
administrators; it may well underlie their questionnaire responses.

While most principals maintain that test results reflect overall school 
performance, many fewer believe that individual teachers can be held 
accountable for them. Only 32% of the elementary-school principals 
conclude that "test results can be used to evaluate teachers' effectiveness
or competence." Among the high-school principals responding,
49% agree. Recall, however, that principals at both levels claim that
they in fact place little emphasis on test results in teacher evaluation. In
general, they tend to trust their own observations of their staffs' teaching
skills. (Again, refer to Table 9, page 31.) In some cases, of course,
administrators who would use test scores to evaluate teachers literally 
cannot do so. As a result of district policy or an agreement with teach- 
ers' representatives, they never receive classroom-by-classroom break- 
downs of students' test results. But many seem to concur with the views 
of an elementary-school principal who argued: 

You can't evaluate teachers from the office. You need to be in the
classroom and be there frequently. Low [test] scores could mean we're




Table 25. Principals' Views on Testing and Related Issues
(N = 221)

                                                     Percentage of Principals
                                                           in Agreement
Issues and Items                                     Elementary    Secondary

Testing As A Local Issue

  Testing is an issue that is discussed a great
  deal in our district                                   61            67

Quality of Tests

  The quality of tests that come with published
  curriculum materials is generally high                 86            88

  The quality of our district-developed tests is
  generally good                                         84            86

  The teachers in my school develop tests of high
  quality                                            [illegible]   [illegible]

  Standardized tests are fair for most students      [illegible]   [illegible]

Value, Usefulness of Testing

  Test scores are a fairly good index of how well
  a school is doing                                      68            74

  Student test scores can be used to evaluate
  teachers' effectiveness or competence                  32            49

  The pressure that required testing exerts upon
  me and the teachers in my school has a generally
  beneficial effect                                  [illegible]   [illegible]

  As a result of minimum competency testing (and
  similar programs), parents are contacting the
  school . . . more frequently or in greater
  numbers                                            [illegible]       54

Desirability, Fairness of Proficiency Testing

  Minimum competency/proficiency tests should be
  required of students for promotion at certain
  grade levels and for high school graduation            58            70

  Minimum competency/proficiency/functional
  literacy tests are generally fair for all
  students                                               58            72

Effects on the School

  In the last five years, the amount of testing
  required by our district, state or federal
  program(s) has increased dramatically                  68            75

  As a result of testing programs (for minimum
  competency, etc.), more time is being spent on
  reading/English and math instruction in our
  school                                                 71            76

  The amount of time that is given to required
  testing and the preparation for it in my school
  is too great                                           31            26

Accountability For Test Results

  Schools should not be held accountable for their
  students' scores on required or standardized
  achievement tests                                      37            30

  Schools should not be held accountable for their
  students' scores on minimum competency/
  proficiency/functional literacy tests                  30            21






not providing the supplies and materials. They could mean working
conditions are a problem. It could be the types of students they're
getting. It could be me. There are too many factors to say, "The scores
are low, therefore the teacher is ineffective."

This way of thinking emphasizes that it is the school as a whole, and
not the individual classroom teacher, that produces test results.

For many principals the value of testing extends beyond scores and
their uses to the influence testing has on the school community. Among
respondents at both levels of schooling, 62% find that testing requirements
exert a beneficial pressure on their teachers and on them. This
lends support to those contemporary school reformers who suggest that
stiffer testing requirements will help raise educational standards.

At least one type of testing requirement seems to influence many
parents' behavior. In most states, laws creating minimum competency
(proficiency) testing also specify that parents be informed of their
children's results. Districts and schools routinely encourage parents to
discuss these results with school officials, and some schedule conferences
with parents whose children have failed to meet minimum standards.
A majority of principals responding (about 55%) observe that
these measures have stimulated greater contact between parents and
schools. Where program requirements are more stringent, i.e., where
proficiency tests must be passed for promotion to certain grades and/or
for high-school graduation, the proportion of principals who note increased
parent contact is somewhat greater (slightly over 60%).

Principals favor proficiency testing for promotion and graduation.
Some 70% of the study's high-school principals advocate that students
should be required to pass a minimum competency or proficiency test
for promotion at certain grade levels and for high-school graduation. A
similar proportion (72%) finds that tests of this type "are generally fair
for all students." Principals of elementary schools tend to support both
views, but by a smaller majority (58%). Principals' opinions on these
issues did not vary substantially according to the requirements now in 
place in their states and districts. 

Here, it is worth noting that CSE data (Choppin et al., 1981) show
[illegible]% of the nation's school districts, serving roughly [illegible]% of its pupils,
require proficiency tests for promotion to certain grades and/or for
high-school graduation. Another [illegible]% of the districts, with about 32%
of the nation's students, also work under state minimum competency/
proficiency mandates. Here, however, the tests are used only for diagnostic
purposes, not as promotion or graduation prerequisites. The
remaining districts, with [illegible]% of the nation's school enrollment, operate
without state-mandated minimum competency/proficiency testing,



although a few of these have established their own proficiency requirements.
State laws have been in flux and the figures may have changed
somewhat since these data were collected. Nevertheless, the picture
outlined here should help to put principals' viewpoints in perspective. 

Principals find that more required testing has led to more basic skills
in the curriculum. For 68% of the elementary-school principals and
75% of those in high schools, the amount of testing required by their
district, by their state, or by federal programs has increased dramatically
"in the last five years" (1977-1982). Simultaneously, nearly three
quarters find that, as a direct result of testing programs, more
instructional time is being spent in their schools on the basic-skill
subjects of reading/English and mathematics. Principals' responses on
these two issues, furthermore, are related at a statistically significant
level: they tend to be consistent much more often than not. (See Table
26.) All this suggests that if most principals' perceptions are accurate, a
recent, marked increase in the amount of required testing has had a
discernible impact on the curriculum: it has pushed instruction toward
the basic-skills subjects that required tests emphasize and (probably)
reduced the teaching-learning time available for other subjects. For the
most part, however, principals do not find testing requirements troublesome.
Fewer than a third say that their schools spend too much time on
required testing and the preparations for it. (See Table 25.) This seems
in line with the majority belief that testing exerts a positive influence on 
the schools. 
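
The statement that principals' responses on the two items "tend to be consistent much more often than not" can be illustrated from the cell counts in Table 26. The sketch below is an illustration only, not a reconstruction of the study's analysis: it assumes the four cell counts as printed in Table 26 (114, 34, 36, 21), the names `counts` and `consistent_share` are hypothetical, and it makes no attempt to recompute the reported chi-square statistic.

```python
# Cell counts as printed in Table 26. Each key pairs a principal's answer
# on "required testing has increased dramatically" with the answer on
# "testing has led to more instructional time on the basic skills".
counts = {
    ("agree", "agree"): 114,
    ("agree", "disagree"): 34,
    ("disagree", "agree"): 36,
    ("disagree", "disagree"): 21,
}


def consistent_share(table):
    """Share of respondents who gave the same answer to both items."""
    total = sum(table.values())
    same = sum(n for (first, second), n in table.items() if first == second)
    return same / total


total = sum(counts.values())
share = consistent_share(counts)
print(f"{share:.1%} of {total} principals answered both items the same way")
```

With the counts as printed, 135 of the 205 principals (about 66%) either agreed with both statements or disagreed with both.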

Teachers: Qualified Support For Tests and Testing

As teachers received their CSE questionnaires in the early 1980's,
social problems such as classroom discipline, school safety, and students'
drug and alcohol abuse captured media attention and preoccupied
many educators. Even compared to such problems, however,
teachers in a majority of schools could define testing as an important
concern (Table 27), just as principals in a majority of districts do.

More broadly, teachers' responses reflect greater concern about tests,
testing, and their effects on schools than do principals'. Teachers do
generally support testing, but that support is less consistent, less overwhelming
numerically, and (thus) more qualified than the support that
principals express. (Refer to Table 27 here and throughout.) 

Most teachers agree that test quality is high, although by narrower
majorities than principals. Well over 70% of the teachers responding
have decided that the content or skills covered by required tests, whatever
their type, is similar to the material that they actually teach. Most






Table 26. Relationship Between Principals' Responses:
Increase in Required Testing and More Time on Basic Skills

                              Testing Has Led To More
                              Instructional Time On The
                              Basic Skills

Required Testing Has
Increased Dramatically        Agree      Disagree      Total

Agree                          114          34          148
Disagree                        36          21           57

Total                          150          55

Chi-square = 37.83, p < .001






(60% to 62%) also agree that the tests developed in their districts are
"very good." Opinion on the quality of commercial tests tends to
divide by grade level. Some 59% of the elementary-school teachers find
commercial tests (such as those that accompany reading and math
series) "are usually of high quality," but only 46% of the high-school
teachers concur.

Teachers seek tests that they find fair and useful. It is impossible to
know, of course, exactly what criteria the survey respondents use to
assess test quality. Other aspects of CSE's Test Use in Schools Study,
however, provide some clues: they suggest that teachers are most concerned
about the fairness and practical utility of tests.

Results of an earlier CSE questionnaire study of testing in five 
California school districts (Yeh, 1978) were reanalyzed in planning for 
the national survey under discussion here. Among the roughly 250 elementary-school
teachers who responded, three criteria stand out as most important
in selecting tests. Listed in descending order of importance, they
are (1) the similarity of test material to what is presented in class; (2)
clarity of test format; and (3) the ease with which the test can be
administered and/or scored. The first two criteria reflect teachers' interest
in test fairness; the third, their desire for practical utility.






Concern with these same three features recurs throughout teachers' 
interview comments on test quality. In addition, interviewees' remarks
reveal a fourth consideration, another dimension of tests' practical 
utility: the degree to which tests yield information that teachers can in 
fact use in their routine teaching tasks. The words of one fourth-grade 
instructor epitomize this concern:

I don't feel we need to test, test, test; but if the information is something I
can use to prescribe instruction, I really don't mind giving it. 

These criteria provide insights into teachers' views of test quality and 
into their test-use practices.

Teachers in both elementary and high schools tend to count the results
of their own, self-constructed tests as especially important for routine
instructional tasks (Tables 12 and 13, pages 36 and 37). Asking teachers
to rate the quality of their own tests seemed unnecessary, but note that
they do have, from the teacher's perspective, all the qualities of good
assessment instruments. In making their own tests, teachers can suit
themselves regarding the fit between what is tested and what is taught.
They design the format. They determine how easily the test can be
administered and scored. They also control the timing of the test, when
the results become available, and other factors that allow the measure to
serve their everyday, practical needs.

In interviews, teachers at the elementary level regularly associate
these same qualities with the commercial tests with which they work 
most frequently — those that accompany their basal reading and math- 
ematics texts. As one explained: 

The district tells us we have to use the tests that go with the book, the
ones you buy from the publisher. But we'd all use them anyway. They
match with the skills we're teaching and present things the same way
[that the book does], so they're really convenient.

This widespread view can help to explain why the majority of 
elementary-grade respondents rate commercial tests as high quality, as
well as why most rely heavily on the results of commercial, curriculum- 
embedded measures (Table 12, page 36). 

High school teachers mention these same criteria in discussing commercial
tests, but they speak of these tests more negatively. With
greater latitude in selecting their course content, they frequently find 
commercial tests less useful than their counterparts in the lower grades. 




Table 27. Teachers' Views on Testing and Related Issues
(Elementary Teachers: N = 486)
(Secondary Teachers: N = 385)

                                                    Percentages of Teachers
                                                         in Agreement
                                                                  Secondary
Issues and Items                                  Elementary   English    Math

Testing As A Local Issue

  In our school, testing programs are generally
  held to be much less important than the social
  problems with which we are concerned                39          32       42

Quality of Tests

  Commercial tests are usually of high quality        59          46       46

  The tests developed in our district are very
  good                                                62          60   [illegible]

  The content (or skills) on most required tests
  is very similar to the content (or skills) that
  I teach                                             72          77       79

Value, Usefulness of Testing

  Testing motivates my students to study harder       73          80       93

  The pressure that testing exerts on the schools
  has a generally beneficial effect                   48          60       72

  As a result of minimum competency testing (and
  similar programs) parents are contacting the
  school . . . more frequently or in greater
  numbers                                             53          42       36

Desirability, Fairness of Proficiency Testing

  Tests of minimum competency/proficiency should
  be required of all students for promotion at
  certain grade levels and for high school
  graduation                                          81          80       90

  Tests of minimum competency/proficiency are
  frequently unfair to particular students            58          48       35

Effects on the School

  Recently, I have been spending more teaching
  time preparing my students to take required
  tests                                               46          41       30

  Tests of minimum competency have affected
  (would affect) the amount of time I can spend
  teaching subjects or skills that the tests do
  not cover                                       [illegible]     62       42

  Basic skills teaching (including remedial work)
  is now consuming a substantially increased
  proportion of our school's educational
  resources                                           88          84       74

  The proportion of our school's resources now
  allocated to basic skills teaching is so great
  as to detract from the quality of our overall
  educational program                                 23          28       21

Accountability For Test Results

  Teachers should not be held accountable for
  students' scores on standardized achievement
  tests or tests of minimum competency                71          61       61






An instructor of senior English spoke for many of his colleagues in 
saying: 

I'll occasionally use a [curriculum] kit or package as is, and then if
there's a test that comes with it, I'll use it. But in most units I'm putting
together materials, combining things from [many sources]. The only test
that will cover it all is the one I make up myself. 

The remarks of a geometry teacher pinpoint another limitation of com- 
mercial tests: 

We rely fairly heavily on the unit post tests we developed as a department 
... We don't use the book tests. Every one of our courses has performance
objectives, and we have designed each unit test to validate to the
performance objectives for the course. The book tests just don't do that 
. . . Our biggest concern is the validity factor, in terms of our objectives 
for the course. 

It is, perhaps, for reasons such as these that 54% of the secondary 
English and math teachers do not consider commercial tests "of high
quality." Such views can also help illuminate why high-school students
spend 75% of their total testing time taking teacher-made tests (Table 4,
page 17).

The broad popularity of district-developed tests (60% to 62% rate
them "very good") can also be traced to their fairness, or validity, and
practical utility. 

That computer-processed data [on our district's objectives-based unit
tests] can really be used with those kids that need help. It does a better
job [than the other tests available] of identifying students and students'
needs. . . I can work on objectives 2, 3, 5 and 9.

The district [testing] system is important because it's the only thing you
can pass on to other schools which is meaningful to everybody. There's a
lot of movement in this town, and the elementary schools, many of them
use different [text] series.

When district-made tests fail to meet these criteria, however, they can be
ignored or deemed a burden. 

You've already tested your kids with the test that comes with the series.
Then you have to give the district tests, 'cause they require you check off
the skills on the [record-keeping] card when they complete them. But the
district test doesn't really fit with the way our series lays things out, so
it's a waste, just more red tape.







No one uses the [district-constructed] unit reading tests anymore. We
used to, before we adopted the new series a couple of years ago. But now
they aren't really valid.

A sizeable minority of teachers does not find district-developed tests
"very good"; problems such as these may explain their judgments.

Finally, a word or two about teachers' views of required tests is
appropriate here. Most survey respondents agree that these measures
generally cover what they teach (Table 27), but many fewer count their
scores as of great importance (Tables 12 and 13, pages 36 and 37).
Interviews offer an explanation for this apparent discrepancy: standardized
and other required tests often fail to meet practical utility criteria.

The [standardized test required annually in our district] is almost useless
in the spring, which is too bad, because I feel there is some valuable
information there, progress and growth. But we get the scores the last
week of school.

A high school teacher added: 

You don't get individual students' scores on the [state-assessment test],
and the standardized results, they're there in the [cumulative record]
folders. But I have 150 students. I don't have time to go down to the
office and look through all those folders.

More generally, nearly every teacher interviewed echoed the views of an
elementary-school teacher in urban New England:

I think that the children feel good about [a test] and I feel good about it if
I can see where it is actually helping the child and you can put it in
context. But when you pull it out of the context, out of the classroom
teaching situation and the actual curriculum, and give a child a test just to
rate him nationwide or whatever, that bugs me. It really bothers me.

This statement summarizes teachers' interest in tests that cover what
they believe they are teaching and also provide information that teachers
can use in their routine teaching tasks.

Teachers value testing as a motivator. Nearly three quarters of the
elementary teachers and even larger proportions of the secondary instructors
(Table 27) claim that testing motivates their students to study
harder. This can be a primary reason for some classroom assessment.
As one high-school English teacher explained in her interview:

I'd like to eliminate the quizzes that I give every week or so, but I have to
do it to motivate the students to do the reading.







Most high-school teachers (60% in English; 72% in mathematics)
also concur that the pressure that testing exerts on the schools has a
generally beneficial effect. "It's kind of nice to get results back," said
one who was interviewed. "It does give you more of a feeling of
accountability and it's not overwhelming." Another added:

I think that within this city there has been a lack of standardized testing,
which I think has allowed things to go downhill. That is, if you don't
measure versus some outside standard you don't know how good or bad
things are going in the system, and it can just tend to get worse.

At the elementary level, however, fewer teachers (48%) agree that
the pressure generated by testing is beneficial. One sixth-grade instructor
voiced a concern felt by many others who were interviewed:

There's too big a trend to judge teachers and schools by tests. They
publish test results in the papers, and people use them to judge teachers
and rank schools. This is the danger [of testing]: using the results in the
wrong way.

Indeed, most teachers who responded to the survey (but somewhat
fewer at the secondary level) assert that teachers should not be held
accountable for students' scores on standardized or minimum
competency tests. (See Table 27, "Accountability For Test Results.") It
appears, then, that many teachers (along with their principals) believe
that schools, but not individual faculty members, bear responsibility
for how learners perform on achievement tests.

About the same proportion of elementary-grade teachers (53%) as
principals observe that parent-school contacts have increased as
a result of minimum competency testing and similar programs. Only a
minority of high school teachers agree: 42% in English and 36% in
math, as compared to 54% of their principals. It may be that parents
speak more frequently with central office personnel than with teachers
about their high-school students' scores. It may also be, as many teachers
argue, that parents' active involvement with their children's schools
diminishes as their youngsters proceed through the grades. Whichever
the case, some teachers of secondary schools fault parents for their lack
of concern. An English Department chairperson captured the feelings
of many when he reported with frustration that:

The point was, the legislature agreed to test [minimum competency]
and to assure effective communication with the possibility of remediation
before the kid goes out [of high school]. We had a form that
we sent out to about 150 parents where the students failed and couldn't






graduate unless they got it together and passed. It said something like,
"Your child has failed the following competencies" (there was a place
to check which ones) "and we'd like you to come in and discuss this."
Well, out of 150 parents only six, I think it was, actually showed up.

In summary, then, most teachers believe that testing exerts useful 
pressure on students, but their opinions are more divided about testing's
effects on educators and parents. 

Teachers heavily favor proficiency tests as promotion and graduation
requirements, but many doubt that such tests are uniformly fair. From
80% to 90% of the survey respondents (Table 27) believe that all
students should be required to pass proficiency tests in order to win
promotion to certain grades and to graduate from high school.
Interviewees' arguments in support of this position were usually quite
general. "It's good for the student to know that he has to pass a certain
level of competency," said one. Another simply asserted, "Students
who are incompetent should be failed." At the same time, a majority of
elementary-school teachers (58%) and substantial proportions of high-school
instructors (48% in English; 35% in mathematics) judge that
minimum competency (proficiency) tests "are frequently unfair to particular
students."

Holding both these views simultaneously, as many teachers obviously
do, does not necessarily signal inconsistency or an indifference to
fairness. One can support the general concept of minimum competency 
requirements while doubting the uniform fairness of the particular tests 
now in use. In fact, there is evidence that as teachers actually experi- 
ence minimum competency testing for promotion or graduation, they 
become more concerned about the fairness of the tests, more cautious 
about using them as gatekeeping standards, or both. This is exactly
what Table 28, below, demonstrates. (Compare teachers' combined,
mean responses on the fairness and should-be-required-for-promotion/graduation
statements. Those of teachers in states where
such requirements are now in effect are significantly lower, that is,
significantly less "pro-competency testing," than those of teachers
elsewhere.)

Fieldwork interviews reveal some of the kinds of experiences that can
lead teachers toward more circumspect views of the fairness and
desirability of testing for promotion and graduation.

I wanted to tell you about the competency tests [said one high-school
English teacher in a state that requires them for promotion and
graduation]. I'm not happy with them, although I was on the committee that
developed them for our district. There are eight competencies the [high
school] kids have to pass. . . In one, they have to read a bus, train or
plane schedule and answer eight questions about it. When we gave the
bus schedule, we found that the black kids, the Hispanic kids, they ride
the bus more and they did distinctly better on that than your more
suburban kids, the white kids. Kids here at this school and others from,
well, where they're more likely to take the bus, they had better results.
There's clearly cultural bias here. . . Another competency is filling out a
job application, a standard form. [He shows one.] See, now if the
student goes over the line here as he fills this in, that's counted as an
error. So some of this is very trivial, unfair really. . . There are other
problems, too, and it's difficult figuring out how to resolve them. You
begin to question whether you can ever come up with a test that's really
fair.

Another teacher of high-school English cited inequities in how his
district handles minimum competency requirements:

The value of the district competency tests is that they are very explicit.
Nobody has any questions about what's being tested. . . And I believe in
failing a student for being incompetent. But you have to place
responsibility on the students to work their way through [the tested skills] step by
step. Here, a sophomore can pass part of the English [competency]
requirement, fail others, and be passed right through all of his other
classes and not be able to write a decent letter, not be able to demonstrate
eighth-grade skills. So now, as a senior, they have special tutoring on
how to pass the test and they graduate as a competent senior. That's not
fair to anyone, either the kid who goes that route or the one who really
masters the skills.

Thus, while there is among teachers a general enthusiasm for minimum
competency tests as requirements for promotion and graduation, there
is also notable concern about the fairness of these tests. This concern is
significantly greater, and questions about the requirements themselves
loom larger, where teachers have had to operate under
testing-for-promotion/graduation mandates.

Most teachers find an increased curricular emphasis on basic skills,
due at least in part to testing, to be acceptable. As reported earlier, the
vast majority of principals have noted a dramatic increase in required
testing through recent years. Such testing, usually in the form of
standardized batteries, minimum competency measures, and state
assessment instruments, typically places heavier emphasis on basic
reading, English, and mathematics skills than it places on other areas of
the curriculum. Citing this fact, critics frequently argue that burgeoning
testing requirements are "contracting" public schools' curricula, forcing
them toward a focus on basic skills at the expense of other subjects.

Table 28. Teachers' Views on the Fairness and Desirability
of Minimum Competency Testing (MCT),
By Current State Requirements*

STATE REQUIREMENT                            SECONDARY¹     ELEMENTARY²

MCT required for promotion/graduation,
  state-mandated measure                        3.56           4.24
MCT required for promotion/graduation,
  local choice of measure                       3.76           4.29
MCT required for diagnosis,
  state-mandated measure                        3.93           4.38
MCT required for diagnosis,
  local choice of measure                       4.20           4.96
No MCT required                                 4.16           4.79

*Explanation: The values on this scale range from 2 (a strongly negative
view of MCT) to 8 (a strongly positive view of MCT).

The scale shows the mean (or average) combined responses of teachers in
each category to two survey statements:

(a) "Tests of minimum competency/proficiency are frequently unfair to
    particular students" (1 = strongly agree, 2 = agree, 3 = disagree,
    4 = strongly disagree);
(b) "Tests of minimum competency/proficiency should be required of all
    students for promotion and for high school graduation"
    (1 = strongly disagree, 2 = disagree, 3 = agree, 4 = strongly agree).

1. Differences between groups statistically significant at p < .05
2. Differences between groups significant at p < .01
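The combined score reported in Table 28 can be illustrated with a short sketch. It assumes only the codings given in the table's explanation, under which statement (a) is coded so that disagreement counts as the more favorable view. The function names and the three sample teachers are hypothetical and are not drawn from the CSE survey data.

```python
# Codings as given in the Table 28 explanation. Statement (a), the "unfair"
# item, is coded so that disagreeing yields the higher (more pro-MCT) value.
UNFAIR_CODES = {"strongly agree": 1, "agree": 2, "disagree": 3, "strongly disagree": 4}
REQUIRED_CODES = {"strongly disagree": 1, "disagree": 2, "agree": 3, "strongly agree": 4}


def mct_attitude_score(unfair_response, required_response):
    """Add the two coded responses; totals run from 2 to 8."""
    return UNFAIR_CODES[unfair_response] + REQUIRED_CODES[required_response]


def group_mean(scores):
    """Mean combined score for one group of teachers."""
    return sum(scores) / len(scores)


# Hypothetical responses from three teachers (not survey data).
sample = [
    ("disagree", "agree"),           # 3 + 3 = 6
    ("agree", "strongly agree"),     # 2 + 4 = 6
    ("strongly agree", "disagree"),  # 1 + 2 = 3
]
scores = [mct_attitude_score(a, b) for a, b in sample]
print(scores, group_mean(scores))
```

A group mean near 4, as in most cells of Table 28, therefore sits below the scale midpoint of 5.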

Principals concede that testing programs have caused more
instructional time to be spent on basic skills instruction, but there is
nothing to suggest that they find this troubling (Table 26, page 64).

On the whole, teachers appear to support their principals and to reject
the critics' argument. Along with the school administrators who
responded, the teachers surveyed report a marked increase in basic skills
instruction. Some 88% at the elementary levels, 84% in high-school
English, and 74% in high-school mathematics agree that "basic skills
teaching. . .is now consuming a substantially increased proportion of
our school's educational resources." Only about 25%, however, feel
that this detracts "from the quality of our overall educational
program." (See Table 27.) Furthermore, fewer than half the teachers
surveyed say that they have spent more time recently preparing their
students for required tests. (At the elementary level, 46%; in secondary
English and mathematics, 41% and 30%, respectively.)

The "testing contracts the curriculum" argument does draw some
support in survey responses, however. Teachers who find they are
devoting more teaching time to preparing learners for required tests
constitute a sizeable minority, as the figures just cited indicate.
Representing their views, one teacher of grades 3 and 4 said,

I'd like to cut all the testing down to about half. It seems like everything
is testing; everything is evaluating. It is so continuing, it's almost
suffocating. We have no time for any music or art. My kids used to learn
English through writing stories and newspapers. We have no time for
any of that. This is just cut-and-dry teaching, drill on tested skills.

In addition, a great many teachers believe that minimum competency
mandates have affected (or would affect, if instituted) the amount of
time that they can spend teaching skills and subjects not covered by
these tests (62% in the elementary grades; 62% in high-school English;
42% in high-school math). Some of the teachers interviewed during
fieldwork explained how this can happen. Discussing a math competency
measure her students had to take, a fifth-grade teacher remarked:

Ahead of time, because the format of the test is so different [from the
tests my students usually take], we had to have the kids do worksheets
and so on of that type so that when they did take the test, they were
familiar with how to go about it. The mechanics of the test. Now, that's
all time out of the classroom, and I couldn't use the scores for a thing.

A high-school instructor in a course called Consumer Math added: 

Well, see they use this course for kids who have failed the [proficiency]
tests. So what I do, I spend the first four weeks doing nothing but
reviewing the skills and having them take old versions of the test, the
first month of school, really. Then you see which kids are going to have
trouble on which of the four tests, then that's what you teach them.

Still another explanation of minimum competency testing's influence
on the curriculum was offered by an algebra teacher:

The first time they gave [the state proficiency test, required for
diagnostic purposes only], I found there were kids having problems with certain
things, and we really didn't emphasize those too much. So I went back
and taught those things, which meant I dropped other units we'd usually
cover.

All in all, however, most teachers appear comfortable with the
increased emphasis on basic skills that they find. And while most believe
that minimum competency requirements affect what they teach, only a
minority conclude that they must spend more time preparing students
for required testing.

Where districtwide socioeconomic status (SES) is lower, teachers
find more emphasis on tested and basic skills. Individual teachers'
responses on the four survey statements just discussed (those listed
under "Effects on the School" in Table 27) tend to correlate highly
with one another. It is reasonable, then, to sum their responses on these
items to obtain an aggregate indicator of the perceived emphasis on
tested and basic skills. CSE survey analysts did so in an effort to
determine whether this emphasis varies with environmental factors.

Districtwide socioeconomic status (or SES) is one feature of the 
school environment that is clearly related to a curricular emphasis on 
tested and basic skills. (See Table 29.) Teachers working in low SES 
communities find more need to stress tested skills in their classrooms 
and more stress on basic skills in their schools than those working in 
higher SES districts. At the elementary level, this response trend is
statistically significant. It appears, then, that testing is driving the 
curriculum in economically disadvantaged areas to a greater extent than 
elsewhere, particularly in elementary schools. 



Table 29. Teachers' Perceptions of the Emphasis
on Tested and Basic Skills,
By District Socioeconomic Status (SES)*

DISTRICT SES RANKING¹        ELEMENTARY²     SECONDARY³

High                            10.41            9.52
Middle                          10.35           10.13
Low                             11.46           10.36

*Explanation: The values on this scale range from 4 (perceive no increased
emphasis on tested and basic skills) to 16 (perceive greatly increased
emphasis on tested and basic skills).

The scale shows the mean (or average) combined responses of teachers in
each category to the four statements listed in Table 27 under the heading,
"Effects On the School" (pages 66 and 67). On each of the four statements,
1 = strongly disagree, 2 = disagree, 3 = agree, 4 = strongly agree.

1. The Orshansky Index was used as an indicator of school district
   socioeconomic status.
2. Differences among groups are statistically significant at p < .01
3. Differences among groups are not statistically significant
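To make the construction of the Table 29 index concrete, the following sketch sums four agreement ratings (coded 1 to 4, as in the explanation) into a 4-to-16 score and then averages the scores within each SES group. The function names and the four sample records are hypothetical. The survey data themselves are not reproduced here, and this is not presented as the analysts' actual procedure.

```python
from collections import defaultdict


def emphasis_score(responses):
    """Sum four agreement ratings coded 1-4; totals run from 4 to 16."""
    if len(responses) != 4 or any(r not in (1, 2, 3, 4) for r in responses):
        raise ValueError("expected four ratings coded 1-4")
    return sum(responses)


def mean_by_group(records):
    """records: iterable of (group label, four ratings) -> {group: mean score}."""
    grouped = defaultdict(list)
    for group, responses in records:
        grouped[group].append(emphasis_score(responses))
    return {group: sum(s) / len(s) for group, s in grouped.items()}


# Hypothetical teachers (not survey data).
records = [
    ("High", [2, 3, 2, 3]),  # 10
    ("High", [3, 3, 2, 3]),  # 11
    ("Low", [3, 3, 3, 3]),   # 12
    ("Low", [4, 3, 3, 3]),   # 13
]
print(mean_by_group(records))
```

Because each item contributes at least 1 and at most 4, a group mean above 10 indicates that the typical teacher in that group agreed with more of the four statements than not.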

If this is in fact occurring, what accounts for it? Is it simply the belief
that students from lower SES backgrounds need more learning time than
others on the basic skills that tests cover? Perhaps, but other forces
seem to be at work here, too. Principals in lower SES schools report
paying more attention to test scores than those in higher SES schools.
They count the results of standardized batteries, state assessment
measures, and district-objectives-based tests as more important for
informing district officials, the public, and parents about school achievement
(Table 11, page 35). In addition, districts more often establish specific
test-score goals for lower SES schools. (Principals in 40% of these
schools report that their districts do so, while only 10% of the principals
in higher SES schools do.) At the same time, however, national studies
repeatedly show that students from lower SES backgrounds do less well
on tests than peers who are more well-off. Thus, in lower SES schools,
where more students have difficulty on achievement tests,
achievement-test scores seem to count for more, to be more consequential. This
can help to explain why, if the teachers responding are correct,
educators in lower SES schools spend more time and resources than others on
teaching the material that tests cover.

In states where minimum competency (proficiency) testing is required
for promotion and/or graduation, high-school teachers note a
significantly greater emphasis on tested and basic skills. To a greater
extent than secondary teachers elsewhere, they find that more school
resources are devoted to basic-skills subjects, that they must spend
more teaching time preparing students for tests, and/or that they must
focus instruction on the skills that minimum competency tests cover.
(See Table 30.) (For some illustration of these phenomena, review the
last set of interview comments, quoted on page 74.)

Table 30. Teachers' Perceptions of the Emphasis on
Tested and Basic Skills, By State
Minimum Competency Testing (MCT) Requirements*

STATE REQUIREMENT                            ELEMENTARY¹    SECONDARY²

MCT required for promotion/graduation,
  state-mandated measure                        10.81          11.06
MCT required for promotion/graduation,
  local choice of measure                       10. P          10. 1.
MCT required for diagnosis,
  state-mandated measure                        10.58
MCT required for diagnosis,
  local choice of measure                       10.11          y.4(J
No MCT required                                 10.79

*Explanation: The values on this scale range from 4 (perceive no increased
emphasis on tested and basic skills) to 16 (perceive greatly increased
emphasis on tested and basic skills).

This scale is the same as that in Table 29. See footnote to Table 29 for
further explanation.

1. Differences among groups are not statistically significant
2. Differences among groups are statistically significant at p < .01

This same response pattern is not evident among elementary teachers.
Those in states requiring minimum competency tests for promotion
and/or graduation do not perceive a greater tested-and-basic skills 
thrust in their curricula than teachers operating under other conditions. 
This may be because the potential consequences of strong minimum 
competency requirements are deemed less serious for students in the 
lower grades (no promotion) than for those in high school (no gradua- 
tion). 

Together with the findings regarding SES discussed in the previous
section, those described here support the hypothesis that where test
results have greater consequences, testing influences the curriculum
more.

CHAPTER 6 



THE SCHOOL CONTEXT AND 
CLASSROOM TESTING PRACTICES 



A central goal of CSE's Test Use in Schools Study was to provide a
national portrait of assessment practices and attitudes toward student
achievement testing in schools across the nation. The four previous
chapters have done that, with illustrations and elaboration from
fieldwork in a number of schools and school districts. A second goal of
the study was to address the question, "What factors seem to influence
the assessment practices that currently exist in our nation's schools?"
A framework for examining this question was introduced in Chapter 1.

One way in which the study tested that framework was by examining
relationships between testing practices and viewpoints and
environmental features external to the school, e.g., state and local testing
requirements, federal and state programs, the nature of the school
community and its students. The results of those analyses which
produced statistically significant results have already been reported. In
review:

• Secondary students in states without minimum competency or
proficiency testing spend a significantly greater amount of
time each year taking classroom achievement tests than students in
other states. Secondary students where minimum competency
testing is required for promotion and/or graduation spend the least
amount of time on classroom achievement testing.

• Teachers perceive a significantly greater emphasis on tested and
basic skills in: (a) elementary schools in lower socioeconomic
areas, and (b) high schools in states that require minimum
competency (proficiency) testing for promotion at certain grade levels
and/or for high-school graduation.

A second way in which the study sought to discover influences on
testing practices and beliefs was by exploring relationships between
and among test use patterns, attitudes toward testing and various school
contextual factors. The latter included leadership practices in
monitoring and supporting testing, teacher training and staff development, the
presence of resources that support classroom testing, the organization
of curriculum and instruction, and the presence of resources that
facilitate instructional differentiation in the classroom.

This chapter reports the results of this exploration. It begins with an
explanation of the variables used in the analyses and then goes on to
describe the relationships uncovered, highlighting those factors which
were found to be significantly related to testing practices. The chapter
concludes with a conceptual model that integrates all the relational
analyses conducted, a model that helps to explain patterns of test use in
the nation's elementary and high schools.

The Variables In the Analyses 

The analyses investigating relationships between and among test use, 
attitudes toward (or beliefs and perceptions about) testing, and school 
contextual factors employed variables developed by aggregating related 
questionnaire items. These variables and their derivations are described 
below.

Test use variables. Information on teachers' use of tests was derived
from the survey questions described in Chapter 3. Use of four types of
tests or assessment strategies was examined:

(1) Use of Formal Testing, including: standardized, norm-referenced
tests; district objectives-based tests; and minimum
competency tests;

(2) Use of Curriculum-Embedded Tests, including: placement,
chapter, and unit and other tests "that come with the curriculum
materials I use";

(3) Use of Teacher-Made Tests;

(4) Use of Teacher Observations and Professional Judgment,
including: "my own observations and students' classwork,"
previous teachers' comments and grades, and previous teaching
experience.

Teachers who responded to the survey rated the importance of each of
these types of assessment for four different classroom tasks: planning,
initial grouping or placement, regrouping or changing placement, and
report card grading. (See Chapter 4 for details.) Thus, to determine
teachers' overall use of each of the four assessment types listed above,
their ratings of the importance of that type were summed across all four
tasks. If, for example, they rated teacher-made tests as "critical"
(value 4) for all four tasks, they received a "score" of 16 for use of
teacher-made tests. Or again, if they rated curriculum-embedded tests
as unimportant (= 1) for planning, somewhat important (= 2) for initial
grouping of students, and important (= 3) for regrouping and grading,
they received a score of 9, adding the four ratings, for use of
curriculum-embedded tests. In the associational analyses, these scores were
averaged across groups of teachers.
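The scoring rule just described can be sketched in a few lines. The importance codes follow the values given in the text (1 = unimportant through 4 = critical). The task keys, function names, and dictionary layout are illustrative assumptions rather than the study's own data format.

```python
# Importance codes as described in the text (1 = unimportant ... 4 = critical).
IMPORTANCE = {
    "unimportant": 1,
    "somewhat important": 2,
    "important": 3,
    "critical": 4,
}

# The four classroom tasks teachers rated; these key names are illustrative.
TASKS = ("planning", "initial_grouping", "regrouping", "grading")


def use_score(ratings):
    """Sum one teacher's importance ratings for a test type across the four tasks."""
    return sum(IMPORTANCE[ratings[task]] for task in TASKS)


def group_average(scores):
    """Average the use scores of a group of teachers."""
    return sum(scores) / len(scores)


# The two worked examples from the text.
teacher_made = {task: "critical" for task in TASKS}
curriculum_embedded = {
    "planning": "unimportant",
    "initial_grouping": "somewhat important",
    "regrouping": "important",
    "grading": "important",
}

print(use_score(teacher_made))         # 4 + 4 + 4 + 4 = 16
print(use_score(curriculum_embedded))  # 1 + 2 + 3 + 3 = 9
```

Each use score therefore falls between 4 and 16, the same range as the perceived-emphasis index used in Tables 29 and 30.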

Beliefs and perceptions variables. Information on teachers'
perceptions and beliefs (or attitudes) about testing was derived from survey
questions described in Chapter 5. Based on confirmatory factor
analyses, these questions were aggregated to create three "attitude"
variables:

(1) General Attitude Toward the Quality of Tests. This variable was
constructed by summing teachers' responses to the statements
listed in Table 27 under the headings, "Quality of Tests" and
"Value, Usefulness of Testing." This provided an overall index
of the extent to which teachers felt testing was, on the whole, a
good thing or a bad thing.

(2) Perceived Emphasis on Tested and Basic Skills. This variable
was constructed by summing teachers' responses to the
statements listed in Table 27 under the heading, "Effects on the
School."

(3) Attitude Toward Minimum Competency Testing. This variable
was constructed by summing teachers' responses to the two
statements listed in Table 27 under the heading "Fairness,
Desirability of Minimum Competency Testing."

The procedures for summing responses in building these scales
followed those described above in the discussion of the test use scales.

School leadership in linking test results with instruction. This variable
was built by summing teachers' responses (not principals') to the
three statements listed under "The School Administration ..." in
Table 20, Chapter 4. It represents the regularity with which school
administrators work with teachers to examine the curricular and
instructional implications of test scores, check to see that teachers
follow up on these implications in their teaching, consider students' test
results in teacher evaluation, and/or establish specific test-score goals
for teachers to meet. Below, all this is glossed by the label, "Curricular
Accountability," since it reflects the extent to which schools make
curricular decisions based on test results and hold teachers accountable
for these decisions.

Information and training about testing. Data on this factor came
from teachers' responses to the items displayed in Table 23, Chapter 4,
which asked respondents to indicate the kinds of informational and
instructional activities their districts and schools had provided in the
area of assessment over the past two years. Exploratory analyses sought
to identify patterns in teachers' answers that would indicate types of
staff development emphases, e.g., training programs that focused on
improving teachers' skills at classroom assessment, in interpreting the
instructional implications of test scores, on preparing students for
testing, etc. These analyses showed no such patterns, however. In the end,
this variable was constructed simply by totaling the number of different
informational or inservice activities in which teachers said they had
participated. Thus, it may represent the amount of attention paid to
assessment issues in a teacher's school as much as it represents the
depth of instruction teachers have received in testing.

Resources that facilitate classroom testing. Data on these resources 
was gathered through the questionnaire items listed in Table 24 of 
Chapter 4. The variable reflects how many of the four resources shown 
there (test item banks, computerized scoring, assistance in correcting 
and grading tests, collegial help in constructing tests) teachers have
available and how frequently they use those that they have. 

Resources that facilitate instructional differentiation in the
classroom. In a set of questionnaire items not previously discussed in this
paper, teachers were asked to indicate which of the following five
human and material resources were available to them: (1) an aide,
paraprofessional, or volunteer to assist with small group instruction or
individual work; (2) other teachers with whom to divide up students
"for extra help"; (3) instructional machines (audiovisual, computer,
etc.) for independent work; (4) alternative curriculum materials for
independent work to meet special needs (e.g., self-paced kits, etc.); and
(5) specialists outside the classroom to whom students can be sent for
special work. In addition to noting which of these were available to
them, teachers estimated how frequently they used those that were.
Thus, this aggregate variable was built by summing the number of the
five resources a teacher used infrequently (several times a year or less,
scored as "1") and the number used frequently (monthly or more often,
scored as "2").
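A minimal sketch of this aggregate follows. It takes frequent use as 2 points, as stated above, and assumes that infrequent use counts 1 point and that a resource which is unavailable or unused counts 0. Both of those values, along with the resource keys and the function name, are assumptions made for illustration.

```python
# Point values: frequent use is scored 2 in the text. Infrequent use = 1 and
# unavailable/unused = 0 are assumptions made for this illustration.
USAGE_POINTS = {"none": 0, "infrequent": 1, "frequent": 2}

# Illustrative keys for the five resources listed in the text.
RESOURCES = (
    "aide",
    "other_teachers",
    "machines",
    "alternative_materials",
    "specialists",
)


def instructional_resource_score(usage):
    """Sum usage points over the five resources; missing keys count as 'none'."""
    return sum(USAGE_POINTS[usage.get(name, "none")] for name in RESOURCES)


# Hypothetical teacher: uses an aide and outside specialists monthly or more,
# and instructional machines a few times a year.
example = {"aide": "frequent", "machines": "infrequent", "specialists": "frequent"}
print(instructional_resource_score(example))  # 2 + 1 + 2 = 5
```

Under these assumptions the variable ranges from 0 (no resources used) to 10 (all five used frequently).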

Students' total test-taking time, in terms of the total number of
minutes spent annually as reported by teachers, was also considered in
the context of these variables. Students' time on testing, however, was
related to none of them; it is discussed no further here.

Some Relationships Between Testing Practices, Attitudes Toward
Testing, and School Contextual Factors

Correlations were run in a first analysis step to explore relationships
between the variables just described. Table 31 shows the statistically
significant results. As noted above, the
information-and-training-about-tests factor reflects how much information and training teachers
received through staff development activities in the last two years. It
seemed reasonable to assume that knowledge about testing and about
how test results can be used in the classroom could facilitate teachers'
use of tests and/or influence their attitudes toward testing. The
correlative analyses support these hypotheses, particularly at the
elementary-school level. More training is associated with greater use of formal tests
for instructional decisionmaking and with more positive attitudes
towards the quality and utility of formal tests. (See Table 31.) Amount
and diversity of staff development, however, are not related to the use of
curriculum-embedded or teacher-made tests, probably because the
kinds of inservice training teachers report usually focus on more formal
measures (Chapter 4, Table 23).

Curricular accountability is also related to test use and attitudes
toward formal tests. Survey results indicate that when principals show
that they care about test scores (by reviewing them to identify
curricular weaknesses, taking action to assure teachers are emphasizing
skills that test scores show are needed, etc.), teachers rate tests as
more important in their instructional planning and, simultaneously, feel
that tests are more valuable and useful.

Survey findings indicate that resources to facilitate classroom testing 
are not widely available (Table 24, page 55). Nevertheless, the greater 
the number that are available, the greater the importance teachers ac- 
cord to all kinds of assessment results, including their own observation- 
based judgments. 

The use of test results for instructional planning and decisionmaking
assumes that some action can be taken on the basis of student test scores,
e.g., providing remediation or advanced work for individual or
small groups of students. Instructional resources, such as aides,
instructional machines, and alternative curriculum materials must be
available to make such actions feasible; where there are no options, no
decisions are necessary and likewise test scores indicating the need for
alternative actions are superfluous. Survey findings support this logic:
availability of instructional resources is related to the use of all kinds of
tests at the elementary school level and to the use of formal and
curriculum embedded tests at the secondary level.

Table 31. Relationships Between Contextual Factors and Testing Practices*

Column groups: STAFF DEVELOPMENT, LEADERSHIP SUPPORT, INSTRUCTIONAL
RESOURCES, TESTING RESOURCES; each group has Elem. (R, M) and
Sec. (E, M) columns.

Coefficients for each row, in the order printed:

Attitude Toward Quality of Tests
    .318  .206  .215  .230  --  .206

Use of Formal Testing
    .350  .300  .198  .256  .219  .235  .163  .333
    .171  .288  .207  .230  .22y  .."40  .126  .220

Use of Curriculum Embedded Tests
    .156  .376  .254  .391  .215  .236  .232  .361  .286  .237

Use of Teacher Made Tests
    .206  .430  .241  .362  .176

*Statistically non-significant (p ≥ .05) correlations have been indicated
with a '--'.

A Conceptual Model for Teacher Test Use 

The previous section presented the results of a series of exploratory
analyses designed to identify possible relationships between school
contextual factors, attitudes toward testing and test use. This section
examines these relationships within the framework of a single
conceptual model that would examine all the influences on testing embodied in
the study, i.e., both those in the immediate school context and factors
external to the school, capturing important policy implications of the
study. It should be stressed that while this examination was conducted
using the techniques of path analysis, the results should not be
construed as anything more than suggestive. Because of the exploratory
nature of the analyses, no formal tests of the conceptual model or of
alternative models were conducted. Only single relationships (paths)
were tested for statistical significance. Thus, while the model presented
shows significant relationships between the constructs, it shows only
one set of relationships, not necessarily the most powerful.
The remainder of this section is organized by the results of the path
analyses for elementary and secondary teachers.

Elementary Teacher Test Use 

The conceptual model shown in Figures 3 and 4 incorporates the
results for four different "outcomes": teachers' use of formal tests,
curriculum embedded tests, teacher-made tests, and teacher
observations/judgments. For each of these, we examined the
relationships between amount of use and the above variables including:
attitudes about quality of tests, perceived emphasis on tested basic skills,
school leadership in linking test results with instruction, information
about tests, testing resources and instructional resources and school
level socioeconomic status. It was hypothesized that school SES would
act as an exogenous variable in this system of relationships. Further, it
was thought that school leadership in linking test results with
instruction would influence the amount of information and training received
by the teachers. That is, participants who were viewed as emphasizing
and supporting greater use of tests were also likely to provide and
require more training on test use. Lastly, it was assumed that leadership
and information would relate to attitudes about test quality and basic
skills press.

The tenability of these hypotheses can be ascertained from the results
presented in Figures 3 and 4, displaying results of elementary school
reading and mathematics. The paths drawn in these figures represent
statistically significant regressions between the variables involved.
Paths not drawn in the diagram indicate that the regression was not
statistically significant.* Looking at the results in these two figures,
one is struck by the high degree of correspondence. In fact, there is only
one relationship that was statistically significant in one case and not the
other. For elementary teachers there is a significant relationship
between the amount of instructional resources and use of formal tests in
math while that relationship does not appear for reading. With that
exception the two models are identical in their structure, indicating that
the same mechanism is likely to be operating regardless of subject
matter.

Beyond the concordance between the two cases there are several
interesting features of the model. First of all, the influence of SES on
the use of tests in decisionmaking is moderated through variables which
are directly under administrative control. Specifically, the amount of
information and training about tests and the degree to which the
principal exercises leadership and holds teachers accountable moderate the
influence of SES on test use. Thus, regardless of a school's SES, it
appears possible through administrative steps to influence a teacher's
use of tests. This administrative effect appears to be manifested through
the attitudes that teachers have about tests. In particular, teachers seem
to have better attitudes about the quality of tests in schools where there
is more information and training about tests. Additionally, teachers
who are more informed about tests and are held more accountable by
the principal for test results also perceive a greater emphasis on basic
skills and basic skills tests. These characteristics translate into greater
use of formal testing in making classroom decisions.
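One common way to read an influence that travels "through" intervening variables in a path model is to multiply the standardized coefficients along the chain. The sketch below shows that rule only; the coefficients are hypothetical, chosen for illustration, and are not read from Figures 3 or 4.

```python
from math import prod

def indirect_effect(path_coefficients):
    """Product of standardized coefficients along one causal chain."""
    return prod(path_coefficients)

# Hypothetical chain, for illustration only:
# SES -> information/training -> attitudes about test quality -> formal test use
chain = [0.30, 0.40, 0.25]
print(round(indirect_effect(chain), 3))  # 0.03
```

Under this reading, even moderately sized links produce a small indirect effect, so a strengthened intermediate link (for example, more training) can matter more for test use than the starting SES level.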

The use of formal tests is also a function of the amount of resources
available to the teacher. The greater the amount of testing resources
(e.g., scanning, scoring help), the greater the use of formal testing.
Further, increased instructional resources lead to greater use of formal
testing. The hypothesis here is that resources permit instructional
alternatives or options. The existence of these options requires greater
decision-making on the part of teachers and hence greater use of test
results.

The use of curriculum embedded tests seems to be a function of the
amount of both testing and instructional resources as well as the
teacher's perception of the quality of tests. In situations where the
teacher feels that the commercial tests are well made, they will be more
likely to be employed in decision-making. Again, the role of resources
seems to be one of making testing or test use more feasible.

*A probability level of .05 was used in these analyses to determine
statistical significance. The single exception to this criterion has been
noted in the Figures. The basis for this exception was the exploratory
nature of the analysis, which generally involves somewhat more lenient
criteria for examination of results.

It is interesting to see in the results of these analyses that the only
contributing factors to the use of teacher-made tests and teacher
judgment are the resources available to the teacher. This finding may reflect
the pervasive use by teachers of these mechanisms for arriving at
instructional decisions almost independent of other sources of
information. That is, there may be a feeling on the part of teachers that their
own tests and judgments are more suitable for decisions than more
formal measures regardless of their attitudes and training about these
latter tests.

In sum, the model portrayed in Figures 3 and 4 shows that the use of
test information in teacher decisionmaking can be influenced by
administrative action. In particular, the administrator can require greater
accountability on the part of the teachers, provide more information and
training about tests and, if feasible, supply additional testing and/or
instructional resources. Each of these actions appears to positively
influence the use of one or more types of tests.



[Figure 3. Conceptual Model for Elementary School Teachers' Test Use in
Reading.* Path diagram, panel title "Elementary Reading." Variables
shown: School SES; Instructional Resources; Testing Resources;
Information and Training About Tests; Curricular Accountability;
Attitudes About Quality of Tests; Perceptions of Basic Skills Press; Use
of Teacher Observations/Professional Judgments; Use of Teacher-Made
Tests; Use of Curriculum Tests; Use of Formal Tests. The placement of
individual path coefficients is not recoverable from this copy.]

*Reported values correspond to standardized path coefficients that were
statistically significant (p < .05).
**Reported coefficient statistically significant (p < .06).

[Figure 4. Conceptual Model for Elementary School Teachers' Test Use in
Mathematics.* Path diagram, panel title "Elementary Mathematics."
Variables shown: School SES; Instructional Resources; Testing Resources;
Information and Training About Tests; School Leadership in Linking Test
Results with Instruction; Attitudes About Quality of Tests; Perceived
Emphasis on Basic and Tested Skills; Use of Teacher Observations/
Professional Judgments; Use of Teacher-Made Tests; Use of Curriculum
Tests; Use of Formal Tests. The placement of individual path
coefficients is not recoverable from this copy.]

*Reported values correspond to standardized path coefficients that were
statistically significant (p < .05).
**Reported coefficient statistically significant (p < .06).






School Context and Classroom Testing 



Secondary Teacher Test Use

Similar analyses were performed for secondary school teachers who 
taught English and mathematics. The results of these analyses are 
presented in Figures 5 and 6. As can be seen from these figures the 
picture at the secondary level is not nearly as clear nor consistent. In 
fact, there are few statistically significant relationships for the English 
teachers and those that do exist are for the use of curriculum tests. 
Because of the paucity of relationships for these teachers it would be 
hazardous to attempt to interpret them or the model. 

The results for mathematics teachers are somewhat more encouraging,
though still not as conceptually appealing as the elementary school
results. The results in Figure 6 show that a somewhat similar
mechanism to that found in elementary schools may be operating for the use of
formal and curriculum tests. That is, it appears that administrative
leadership, information about tests, and testing resources are all
influencing the use of formal and curricular tests. What appears to be
different at this level, however, is the greater direct role of school
leadership in linking test results with instruction. This variable has
strong direct relationships to both use variables. Further, this variable,
rather than information about tests, seems to relate to teachers'
attitudes about test quality. Thus, these results seem to point to a greater
direct role for the principal at the secondary school than at the lower
grade levels. It should be noted, however, that the same constellation of
factors is involved; it is just their relative priorities and
interrelationships that are different. Therefore, from a prescriptive point of view,
working on the three variables of information and training about tests,
school leadership, and testing resources seems most likely to pay off in
terms of greater teacher use of formal and commercial tests.

In summary, these analyses have explored a possible prescriptive 
model for teachers' use of different types of information in their 
decisionmaking. While the results showed some disparity between 
elementary and secondary teachers, particularly for secondary English 
teachers, some definite similarities were found. In particular, it appears 
that three policy relevant and administratively manipulatable variables 
are related to increased use of formal and commercial tests. These three 
variables are the amount of curricular accountability operating in the 
school, the amount of information and training given to the teachers 
about tests, and the amount of testing-related resources made available 
to the teacher. It would appear that if increased use of formal test results 
were considered a desirable goal, increased emphasis should be placed 
in the three areas mentioned above. 



[Figure 5. Conceptual Model for Secondary School English Teachers' Test
Use.* Path diagram, panel title "Secondary Reading." Variables legible
in this copy: Instructional Resources; Use of Teacher Observations/
Professional Judgments. The path coefficients are not recoverable from
this copy.]

*Reported values correspond to standardized path coefficients that were
statistically significant (p < .05).

[Figure 6. Conceptual Model for Secondary School Mathematics Teachers'
Test Use.* Path diagram, panel title "Secondary Mathematics." Variables
shown: School SES; Instructional Resources; Testing Resources;
Information and Training About Tests; Curricular Accountability;
Perceptions of Basic Skills Press; Use of Teacher Observations/
Professional Judgments; Use of Teacher-Made Tests; Use of Curriculum
Tests; Use of Formal Tests. The placement of individual path
coefficients is not recoverable from this copy.]



CHAPTER 7 

SUMMARY AND IMPLICATIONS: 
ISSUES FOR STATE AND NATIONAL 
POLICY MAKERS 



The findings of CSE's Test Use in Schools Study map the topography
of basic-skills achievement testing and achievement test use in public
schools across the United States. They show patterns of local
assessment practice, demarcate the domain and scale of local leadership in
assessment, and shade in the tones of local educators' beliefs about
testing and its influences on their schools. Through its associational
analyses, the study also draws some tentative lines between regions on
this map. That is, it models some ways in which these within-school
phenomena appear to be tied functionally to one another and to certain
conditions beyond the schools.

This map was constructed, as Chapter 1 explained, with certain
policy concerns in mind. Thus, it not only describes the landscape of
public school achievement testing; it also illuminates it such that: (1)
some issues and concerns particularly important to national and
(especially) state policy makers stand out in relief; and (2) some answers to
local policy makers' questions become clearer.

After an interpretive review of study findings that frames the
discussion of both these sets of policy issues, this chapter outlines three
findings that fall in the first category listed above: those most
appropriately addressed at the state and national levels. One is the matter of
equity in testing, as raised by study findings regarding the impact of
required tests. The second is the issue of teacher preparation and local
test quality, as raised by findings of this and related studies. The third is
the critical need to explore ways of integrating, aligning, or
rationalizing assessment such that the same or similar test data can be
aggregated to address the diverse needs and multiple questions of
policy makers at various hierarchical levels in the nation's educational
system, e.g., in the classroom, the school, the district, the state, and the
federal government.

In the next chapter, case study data elaborate survey results and
suggest concrete answers to questions of test utilization and testing
efficiency at the local level. More specifically, that chapter
demonstrates some ways in which district administrators can act to achieve
collective links between testing and instructional decisionmaking.

Summary: The Study Reveals Two Tiers of Achievement Testing,
Both Under-Utilized.

A close examination of Test Use in Schools Study results confirms
that there are two tiers or layers of student-achievement assessment in
our schools today. These are consistently distinguishable from one
another in their proprietorship, characteristics, and functions. One tier
of assessment is internal or local to the schools. It is "owned," and for
the most part produced, by teachers themselves. This local or internal
tier includes two main types of assessment: (1) the tests, quizzes, and
other measures that teachers construct and administer in the course of
their teaching, and (2) the clinical judgments of students' achievement
that teachers form as they interact with students and observe their work
in various classroom situations day after day. A third kind of measure
also figures in this tier, but it is especially important for
elementary-school teachers. These are the tests included with commercial
curriculum materials used in the classroom. While these are not produced in
the school, teachers in the elementary grades are most often invested in
them. Teachers often have a say in choosing (and choosing how much to
use) them and the materials they accompany; teachers can time their
administration and adapt their content to fit the pace and emphases of
instruction.

The second tier of assessment is external to the school: mandated by
the district, state, and/or suggested by federal program requirements
(e.g., for placement in compensatory education programs).
Norm-referenced, standardized test batteries are the most common among
these. Other types of measures used for minimum competency (or
functional literacy) testing or as part of state assessment programs are
also included here. In some cases, too, tests constructed or purchased
by districts and referenced to their curricular objectives fall in this
second category. Tests of these kinds are developed beyond the schools.
Their administration is called for primarily to meet organizational
needs and concerns at higher levels of public-education governance.
Those who work at those levels may have a sense of ownership in these
tests; educators in the schools rarely do.

These two tiers of assessment function quite differently in most
schools and districts. Teachers and principals rely heavily on the results
of internal assessment strategies and consider them important as they go
about routine instructional planning and decision making. At the same
time, they generally treat information from external testing as of minor
importance, using it only occasionally and idiosyncratically. These
patterns are obvious in both CSE's fieldwork findings and survey data.

When teachers were interviewed during pre-survey fieldwork, they
discussed all the information they had throughout the year on students'
academic capabilities, performance, and progress; they described
whether and how they used that information. Collectively, they cited
far more uses for the information that came from assessment strategies
that were local to the school and classroom. (See Table 17, page 42.)

Teachers surveyed across the nation were asked to rate the
importance of diverse types of assessment results in four routine
decisionmaking tasks. Again, the pre-eminence of the internal tier of
assessment was apparent. (See Tables 12 and 13, pages 36 and 37.)
Principals in CSE's national survey were asked to rate how important a
role data from various sources played in eight regular school-level
administrative activities. Here, the separate functions of the two tiers of
achievement assessment were especially apparent. Principals reported
counting internal assessment data more heavily in making
instructionally relevant decisions, e.g., allocating funds, assigning
students, evaluating teachers. But they indicated that results of external
measures were more important in reporting to those beyond the school,
e.g., to district administrators and the public. (Review Table 10, page
32.) Further evidence of the functional independence of the two tiers of
student-achievement assessment appears in Figures 3 through 6 of
Chapter 6. In general, these figures show two networks of relationships.
One includes the use of measures external to the school (formal tests);
the other, internal assessment techniques (teacher-made tests, teacher
observations and professional judgments). The use of tests in the
external tier varies in response to a chain of factors that usually includes the
perceived need to emphasize tested and basic skills ("basic skills
press"); administrators' holding their teachers accountable for
test-score-based curricular decisions ("curricular accountability"); attitudes
about test quality; and information and training about tests. None of
these factors, however, influence the use of the two most widespread
types of school-based, or internal, assessment: teachers' tests,
observations and judgments. Instead, teachers' use of the latter is tied only to
classroom circumstances: to instructional resources that permit
differentiated instruction to meet students' individual learning needs and
(less strongly) to resources that save time in testing. (The single
exception is in high-school English classrooms, Figure 5, where teachers' use
of local measures does not covary with any of the factors included.)






These findings suggest that external test results become more
important to teachers only when something or someone impels or induces
teachers to treat them as more important. Instructional circumstances
do not influence teachers' use of these results. On the other hand, the
results of internal assessment techniques are influenced by
instructional assessment circumstances. When classroom conditions
demand and facilitate closer, more fine-grained evaluation of students'
performance, it is their own, local measures that they weigh more
heavily.*

*Note that the use of curriculum-embedded tests, considered here as
internal measures, tends to fall between or overlap the relational
networks described above. Nevertheless, use of these tests generally
correlates more strongly with classroom instructional and testing
resources than with the factors that influence external tests.

Taken together, the research findings just cited show that there are
notable quantitative differences in the ways the external and internal
tiers of assessment are used by educators in the schools. They reveal
that the results of externally mandated testing serve fewer purposes
(Table 17) and are not counted as heavily in planning or decisionmaking
(Tables 9 through 13). But fieldwork clearly suggests that there are also
significant qualitative differences in how the two tiers of assessment are
typically utilized by teachers and principals. The results of external
tests are most often examined briefly, casually, and asystematically. Do
principals consider the results of standardized and
district-objectives-based tests in curriculum evaluation? Table 9 suggests that they do. But
interviews indicate that this often means that they merely glance over
the scores, mention them in a faculty meeting, and point out the areas in
which the school did especially well or poorly. (See quotations, page 84
in Chapter 5.) Do teachers use standardized test results in planning?
Apparently they do to some extent (Tables 1 and 2). Fieldwork
suggests, however, that, more often than not, this means a once-a-year visit
to the office for a quick look at their students' cumulative files. Are
standardized test batteries and minimum competency scores consulted
in student placement? Again, Tables 9, 12, and 13 indicate that they are.
But visits to schools make clear that they are most often consulted as
part of an automatic or cursory gate-keeping procedure. Law or policy
guidelines direct that students with scores below a certain cut-off point
be placed in a compensatory program or remedial class. Alternatively,
as one high-school teacher put it in describing a procedure reported by
many:

They give me each kid's standardized-test score on my class roster. If
one stands out, I usually check with the counselor to be sure the kid
should really be assigned to geometry.

Such uses contrast sharply with teachers' recurrent and systematic
use of assessment techniques that are local to the classroom and school
in an on-going process of instructional planning and decisionmaking.
They contrast markedly with principals' serious consideration of
teachers' advice, recommendations, and grades on teachers' assignments in
making budgetary decisions or next year's class assignments. And they
certainly do not constitute thorough utilization of external testing data
in a systematic process of school-wide analysis, decisionmaking, or
planning of curriculum and instruction.

Why do the two tiers of achievement assessment function in the
different ways that they commonly do? The reasons are not hard to find.
They lie in the interplay of several factors: characteristics of the
measures themselves, circumstances surrounding their availability,
educators' training in assessment, and the organization of educational
planning in schools, districts, and beyond.

American educational organizations (schools, school districts, etc.)
have been called "loosely coupled systems" (cf. Deal, 1979; Meyer
& Rowan, 1978; Montjoy & O'Toole, 1979). Schooling in the United
States has been described as "pre-industrial -- a cottage industry"
(Dawson, 1977). And teachers in classrooms have been likened to
"street-level bureaucrats" (e.g., Weatherley & Lipsky, 1977). These
similes call attention to the relative autonomy of the classroom teacher
in a multileveled decisionmaking hierarchy, a hierarchy in which
participants at each level have interests and concerns that only partially
overlap, only sometimes coincide.

For their part, teachers routinely do a great deal of instructional
planning. They have a major role in planning what to teach (and/or
emphasize) and how to teach it, in diagnosing individual students'
learning needs, and in assuring that students are working at appropriate
levels in the curriculum. As the school year unfolds, they need to
monitor their students' progress, to consider whether and how to adjust
the pace and emphases of their teaching, to grade students and inform
parents of achievement-to-date, and so on. To do all this and do it well,
teachers need assessment tools with three basic characteristics: (1)
Validity: they must assess what the teacher believes he or she has
actually taught in a way that seems consonant with the way he or she has
taught it; (2) Suitability: their intended purposes must fit the tasks the
teacher needs to accomplish (thus teachers seek placement tests for
placement, chapter and unit tests for monitoring progress and grading,
etc.); and (3) Immediate Availability: the teacher must be able to
employ them whenever it seems appropriate to do so and have the
results back promptly. In short, the assessment tools that teachers need
must be sensitive to local conditions, to the array of particular
circumstances in their particular classrooms at the moment. And, in order to
function throughout the year as the instructional leaders of their
schools, principals need measures of the same kind.

It is not surprising, then, that both teachers and principals rely
heavily on assessment strategies that are internal to the school and its
classrooms: teacher-made tests and assignments, teachers' observations
and clinical judgments, and the adaptable, readily available tests that
come with the commercial curriculum materials they are using. From
their points of view, these internal measures have all three of the
characteristics listed above. Externally mandated measures, on the
other hand, usually do not. They are not designed primarily to provide
data for routine classroom decision making. The fit between their
contents and format and a particular teacher's curriculum is problematic.
Often, their scores are not returned until weeks or months after
administration. Often too, the results come back in a format teachers and many
principals find unfamiliar and/or cumbersome. (See Table 19, page 48.)
For any or all these reasons, the results of standardized tests, other
minimum-competency measures, and many district-objectives-based
tests can seem remote and irrelevant to teachers and principals. In
addition, teachers and principals generally have limited formal training
in testing and measurement or the use of test data. (See Table 23, page
54.) Further evidence that supports this claim will be found further on
in the chapter. This also limits the accessibility of external testing data
to educators in the schools. CSE's Test Use in Schools Study fieldwork
found teachers and principals voicing these very concerns as drawbacks
of external testing. (See illustrative quotations in Chapter 5, pages 68
and M.)

But the very characteristics that make internal assessment tools ideal
for use in individual teachers' and principals' routine work severely
restrict their utility for systematic school- and district-wide planning.
Their content and the timing of their administration are idiosyncratic,
variable from classroom to classroom. Aggregating the data they
provide in order to see achievement patterns across grade levels, a
department, or the entire school, therefore, is difficult if not inappropriate and
impossible. This is especially true of teacher-made tests and
assignments, but it also often applies to tests embedded in texts and other
commercial materials. (Teachers time their administration differently;
they sometimes adapt their contents. The same materials or text series
are not always used throughout the school.) And while teachers'
cumulative observations and experience-based judgments are valuable
sources of information, they cannot be readily synthesized into a
precise, detailed picture of specific curricular or teaching strengths and
weaknesses across many classrooms or schools.

It is these problems with local or internal assessment strategies that
have made standardized, minimum-competency, and special
district-objectives-based tests attractive to local school districts, and make
similar measures a virtual necessity for states and other educational
agencies. By providing standard and consistent data across settings,
such tests facilitate comparisons among classrooms, schools, and/or
districts; they permit year-to-year monitoring of performance. They are
likely to be more sound psychometrically than teachers' own tests; in
most circumstances they are sufficiently valid to indicate broad patterns
and trends. Tests of these kinds can take time to administer, score and
analyze comprehensively, but comprehensiveness is important to
district and state planning, especially if data are gathered only annually or
biannually. Coming full circle, however, the same features that make
these types of measures useful to districts and larger education agencies
generally limit their usefulness for teachers and principals. Thus, two
tiers of achievement testing, largely distinct in their functions, are
maintained in public schooling.

As noted earlier, the next chapter will present research-based models
and guidelines detailing how districts and schools can begin to integrate
these two tiers of testing and use both more fully in planning for
instructional improvement. The remainder of this chapter, however,
goes on to examine three important issues that their separation raises for
state and national policy makers.

External Assessment: Study Findings Raise Issues of Equity

Chapter 1 explained some of the mechanics through which formal,
mandated tests (the external tier of assessment) can serve as
interventions, or agents of educational change. (See pages 4 and 5.) With this
"testing as an intervention" hypothesis in mind, CSE sought to
identify whether tests required by agencies beyond the school are in fact
influencing school programs and so students' educational experiences
and life chances. Among the policy questions underlying the Test Use
in Schools survey (Chapter 1, pages 2 and 3) several addressed the
influence of minimum competency testing: What are the impacts of
different kinds of minimum competency programs? Have they affected
curriculum and instruction? Have they wrought changes in the other
ways that districts and schools measure student achievement? A second
set of policy issues was raised about the formal testing (most often
standardized, norm-referenced testing) occasioned by the evaluation
requirements of state and federal education programs: How does such
testing affect the instructional time of participating students? How does
it influence the distribution of instructional staff members' energies and
efforts?

Answers to these questions have been offered through the preceding 
chapters. Here, it is appropriate to review them and to extrapolate their 
implications. 

Minimum competency testing: three potential sources of education
inequity. Study findings raise the possibility that differential minimum
competency or proficiency requirements from state to state (and in some
states, from district to district) are generating educational inequities.

First, there is reason to question whether the tests in use are
uniformly fair. Substantial percentages of teachers, especially in the elementary
grades and high-school English, think that they are not (Table 25, page
60). Furthermore, where laws now specify competency tests as
prerequisites for promotion to certain grades and for high-school graduation,
both elementary and high school teachers are significantly more inclined
to doubt their fairness and the wisdom of using them as gatekeeping
measures. (Table ?S, page 73.) Put another way, those teachers in the
best position to know the tests and to judge how well they function in
sorting minimally competent from incompetent students are the very
teachers most likely to doubt their equity and desirability.

Most teachers, of course, are not experts in testing and measurement.
(See the discussion below, pages 103 to 106.) Their judgments of test
fairness cannot be taken as definitive. Nevertheless, the patterns of their
survey responses should be sufficient to stimulate policy makers'
continued concern about such issues as the instructional validity and
cultural-linguistic bias of the proficiency or minimum competency tests now in
use.

Second, survey results indicate that competency or proficiency
testing may be generating differences in the frequency of routine classroom
assessment in high schools. This, in turn, may be producing inequities
in the quality of instruction. In secondary schools where no
state-mandated competency tests exist, students spend roughly 62 hours a
year taking English tests and 53 hours a year taking mathematics tests
(Table 6, page 20). Given that tests in these subjects average about a
half-hour each (Table 3, page 16), this means that the typical student in
these schools takes an English test on the average of three times a week
and a math test on the average of two-to-three times a week through 37
weeks of instruction each year. Where proficiency or competency tests
are required for promotion and/or graduation, however, high-school
students average half-hour tests in each of these subjects once a week or
less across the school year.*

No one knows what the optimal frequency of testing is, and some would
argue that testing should be minimized to "save" class time for teaching and
learning. A number of studies, however, indicate that frequent
monitoring of student progress is an important characteristic of more effective
schools. (See Purkey & Smith, 1982, for a comprehensive, critical
review.) Combined with CSE's survey findings, this suggests that
policy makers in both states and districts should be concerned about the
direct and indirect effects of minimum competency requirements on
local assessment practices. Whether and how these requirements
influence classroom testing should be closely examined; research should
explore how often testing should optimally occur. But if frequent
monitoring of students' progress and prompt feedback on student
performance are features of effective teaching, differential competency
mandates may be contributing to inequities in the quality of students'
instruction from one state to another.

Finally and perhaps most importantly, survey results raise the
possibility that minimum competency or proficiency testing programs are
working to produce state-to-state differences in the breadth of the
curriculum that students experience, especially at the secondary level.
There is substantial evidence that examinations with important
consequences tend to influence the curriculum in schools where they are
given (e.g., Cronbach, 1963; Linn, 1983a, b; Madaus & McDonough,
1979; Tinkleman, 1966). It is hardly surprising, then, that teachers in
high schools where minimum competency tests are required for
graduation agree, to a significantly greater extent than teachers
elsewhere, that these tests affect the amount of time that they can spend
teaching subjects and skills the tests do not cover, that they have
recently been spending more teaching time preparing students for
required tests, and that the proportion of their schools' resources
allocated to basic skills teaching is so great as to detract from the
quality of their overall educational programs. (Refer to Table 30,
page 76 in Chapter 5.)

*In states that require tests for promotion/graduation and mandate the
measure that schools must use, the averages are a classroom English test
1.2 times per week and a mathematics test once in every seven school
days. In states that require tests for promotion/graduation but permit
districts to select or design their own measures, the average is a
classroom test every seven to seven-and-a-half school days in both
English and mathematics at the secondary level.

Some maintain that tests should influence the curriculum. Linn 
(1983a, p. 125), for example, takes the position that 

a test provides the means of making agreed-upon objectives clear and 
precise. An important goal of instruction should be the achievement of 
those objectives as demonstrated by performance on the test. 

Especially in the case of minimum competency in the basic skills, there 
are many who would agree. Educational policy makers and practicing 
educators, they would argue, should establish clearly and precisely the 
basic proficiencies they expect students to have at various milestones in 
their schooling. Instruction should work toward the achievement of 
these minimal objectives, and students should demonstrate that they 
have attained them through test performance. Indeed, it was arguments 
such as this that promoted the passage of minimum competency, profi- 
ciency, or functional literacy testing legislation in over 40 states. 

Few would quarrel with the idea that students should attain minimal 
standards of proficiency in basic skills. But if the perceptions of teach- 
ers surveyed by CSE are accurate, requiring minimum competency 
testing for promotion and graduation may be narrowing the secondary 
curriculum: inducing districts, high schools, and individual teachers to 
emphasize the tested, basic, functional literacy skills at the expense of 
other learning. Thus, those students in states with such competency
requirements may be limited to learning less about advanced
composition, and less of the analytic and problem-solving skills that
these subjects entail, than students in other states with different
requirements, and less than they themselves might be learning were their
teachers not spending class time working to assure that everyone is
proficient in the minimum, tested skills. Perhaps, then, these students
(many of whom would certainly pass minimum competency tests in any case)
are being placed at a disadvantage as compared to students in states
where proficiency testing is not required or required only for diagnostic
purposes.

Of course, secondary students who fail proficiency tests where there
are graduation requirements are more likely than others to experience
a contracted curriculum. Fieldwork indicates that they are often placed
in special remedial courses centered on the skills that the tests cover.
The creation of such courses, however, can mean that fewer sections of
more advanced courses are available for other students. (States have not
always provided additional funding for remediation to accompany
competency legislation; districts cannot always hire the extra teachers
that would be needed to both maintain current course offerings and staff
remedial sections.) And while it is certainly important to make sure
failing students gain minimal competence in basic reading, writing, and
mathematics skills, it is also important to recognize that these skills in
themselves do not open many doors in an increasingly high-technology
society.

In short, CSE's survey findings raise serious questions for policy 
makers about the cost-benefit trade-offs of competency testing require- 
ments, as well as questions about their equity. The tests may be unfair 
for many students. They may be reducing the frequency of routine 
classroom testing and (thus) the quality of instruction. They may be 
narrowing the curriculum and, with it, the range of opportunities open to
many students. These possibilities deserve the attention and investiga- 
tion of all those who shape educational policy at the local, state, and 
national levels. 

Testing for state and federal program requirements: additional equi- 
ty issues. Study findings also suggest that testing conducted to meet the 
evaluation requirements of federal and state educational programs may 
be influencing the educational experiences of low-income students at 
the elementary level. 

According to principals' reports, the results of formal tests carry
more weight and have greater consequences in schools serving low
socioeconomic status (SES) neighborhoods than in those serving higher
SES communities. In the former, they count far more in such tasks as
planning curriculum, deciding on students' class assignments,
allocating school funds, and reporting to the public, district officials,
and parents. (Refer to Table 11, page 35.) The role played by formal
tests in these low-income schools is often mandated or enhanced by the
special state and federal education programs in which they participate.
Standardized, norm-referenced scores are commonly used in low-income
schools, for instance, to establish individual students' qualifications for
compensatory education programs. Formal testing plays a part, too, in
the placement of non-English-speaking and limited-English-speaking
students (many of whom came from lower SES families) in bilingual
programs. These and similar programs usually entail evaluation
requirements, and these requirements are frequently met through formal
testing. Thus, as noted earlier (Chapter 3), federal and state program
requirements help to make test scores especially salient in the very
schools where more students more often have difficulty doing well on
formal tests. And, to a significantly greater extent than others, teachers
in lower SES schools report a greater need to spend classroom time on
tested, basic skills and preparing students for required tests. They are
also significantly more inclined to agree that the resources allocated to
basic skills instruction are so great as to affect the overall quality of
their schools' programs. (See Table 29, page 75.)

Certainly all of the emphasis placed on test scores in low SES schools 
cannot be traced to the presence of state and federal program require- 
ments. Nor can the greater attention given tested, basic skills in these 
schools be ascribed solely to their emphasis on test scores. Neverthe- 
less, as noted in the last section, tests with important consequences can 
and do influence curriculum, and it is clear that state and federal 
program requirements do help to make test results more consequential 
in low SES neighborhood schools. Thus, those who establish the
requirements for state and federal programs should give careful
consideration to the role additional emphasis on test scores may play in
narrowing the curricular opportunities of low-income elementary students,
which can only add to the disadvantages such students already encoun- 
ter. 

Internal Assessment: Test Use in Schools Study Findings and 
Related Research Raise Issues of Teacher Preparation and Test
Quality 

While CSE study findings on the external tier of assessment (or 
formal testing) raise educational equity issues for policy makers, re- 
sults regarding the internal tier of assessment generate concerns about 
test quality and teachers' training in assessment. 

The formal tests mandated by agencies outside the school often play a
role in major gatekeeping decisions regarding students. But
teacher-made tests, teachers' daily assignments, and teachers'
observations and judgments play at least as great a role in influencing
students' educational experiences and life chances. Constituting the tier
of assessment internal to the schools, the results of these techniques
are critical in schoolwide decisionmaking. They influence curricular
planning, the distribution of school funds, and students' assignment to
classrooms. They also weigh heavily in what schools tell parents about
their children's progress. (Review Tables 9 and 10, pages 31 and 32.)
They are equally important in the classroom. They help to shape
teachers' planning as the school year begins, significantly affect their
placement of students in learning groups, and count most in their
calculations of students' report-card grades (Tables 11, 12, 13, and 16
in Chapter 3). Thus, the various teacher-designed strategies of
achievement assessment cumulatively shape students' learning
environment, academic self-concept, educational status, and (ultimately)
their socioeconomic opportunities.

Despite the obvious importance of teachers' tests, assignments, and
clinical judgments, studies have repeatedly shown that teachers receive
little pre-service training in assessment. Reviewing some of this
literature in a recent paper, Coffman (1983) wrote:

In 1959 Mayo reported a study by Noll indicating that 83% of 80
colleges he had surveyed offered a course in measurement, but that only
14% of them required one of all teacher education students.
Furthermore, only 10% of the states required a course for certification.
Ten years later Stinnett (1969) made no mention of any requirement in
educational measurement in his encyclopedia article on teacher
certification, nor did Burden (1982) thirteen years later. It seems
obvious that only a minority of teachers have had any intensive training
in educational measurement.

Recent research also indicates that teachers remain poorly prepared in
assessment (Rudman, et al., 1980; Woellner, 1979; Yeh, et al., 1981).
And as CSE's survey indicates, in-service training does little to fill the
gap. Only about one-fifth of the teachers responding received staff
development related to the selection and construction of good tests or to
the use of test results to improve instruction.

Very little direct information is available about the quality of
teacher-developed tests. As the previous paragraph should suggest,
however, that which is available reveals that teachers lack skill in test
construction. Ebel (1967) identified a variety of common errors in
teachers' tests and urged better training in this area. In a recent review
of teacher-made tests, Fleming and Chambers (1983) found that teachers
write more questions of the short-answer kind than of any other type;
they rarely devise essay examinations. For the most part, too, the tests
reviewed required students to recall facts and terms. Questions requiring
learners to translate, apply, or otherwise use knowledge were rare.
Furthermore, Fleming and Chambers discovered a "general tendency" to
omit test directions, to use illegible test copies, and "to omit the point
values to be assigned to test questions. This trend suggests that teachers
may not be visualizing their tests as means for quantifying students'
performance as a measure of students' learning. This trend appears to
confirm reports in the literature . . . that teachers' knowledge of
fundamental measurements concepts is limited" (Fleming and Chambers,
1983, p.

All in all, it seems worth considering just how qualified today's
teachers are to be developers of the tests that most affect students'
lives. How effective are teacher-generated tests in revealing the
insufficiency in individual students' learning? How valid are they as
measures of students' achievement? How do teachers decide how often to
test? How skilled are elementary school teachers in analyzing the
commercial curriculum-embedded tests that they frequently use? Similar
questions can also be raised about teachers' skills in making
observation- and interaction-based judgments of children's learning.

Given the time spent on teacher-constructed tests and given the
cumulative importance both of these tests and of teachers' judgments in
classroom and schoolwide decisionmaking, teachers' preparation for
the role of achievement assessor and their competency in that role need
thorough review. And this review deserves the attention of both the
educational policy and the educational testing communities.

Toward More Integrated And Rational Assessment Systems 

While they work to examine and (as necessary) rectify equity and
quality problems in our current system of achievement assessment,
policy makers will be well advised to explore ways for integrating that
system and making it more rational.

As the opening of this chapter explained, Test Use in Schools Study
findings reveal national testing practices which are bifurcated by
internal and external needs, replete with overlapping requirements at the
federal, state, and local levels. The result is two systems or tiers of
testing which are redundant and inefficient. Furthermore, survey
findings show that significant teacher and student time is spent in
required testing, representing fully half of the testing at the elementary
school level and one-quarter of the total student testing time at the
secondary level. This time presumably serves the decisionmaking and
accountability needs of policymakers, but (as study results clearly show)
serves very little the information needs of most principals and teachers
and is little used by them. Meanwhile, teachers and students spend
considerable time on teacher-made, curriculum-embedded tests, which
reflect the instructional programs and serve the classroom
decisionmaking needs of teachers, but which have little impact in the
policy arena. In other words, both teachers and policymakers devote
considerable attention and resources to testing, but view each other's
efforts as invalid for their purposes.

While several reasons for this mutual rejection have been described
above, the fact remains that teachers, principals, district
administrators, and other policymakers all require information about the
same phenomena: the academic progress of students and the extent to which
students are achieving the skills which teachers and schools intend to
teach. And while the information needs of administrators and
policymakers may differ from those of teachers and principals (i.e., needs
for generalizable, comparative information vs. idiographic information
which is sensitive to local context), both share the need for
validity. Yet achievement tests are valid measures of school progress
and of accountability only under very special conditions: where their
content matches the specific instructional intentions of schools.
Ultimately, then, the information needs of teachers and policymakers
may be very similar, although their roles and respective
responsibilities imply considerably different levels of specificity and
periodicity in assessment.

Given this similarity in essential information needs, it should be
possible to design, in place of overlapping requirements and duplicative
efforts, multipurpose testing systems which can simultaneously serve
the needs of both policymakers and local educators. Such testing
systems might provide very detailed and frequent information at the
classroom level and for the local school site, but be combined and
aggregated for decisionmaking purposes at other levels. For example, a
test might provide a teacher with detailed diagnostic information about a
student's strengths and weaknesses in reading objectives targeted for
classroom instruction; the results of that test could also be aggregated
by instructional group or class for classroom decisionmaking, be
combined over time for the class and grade for school-level planning, and
then summarized for district-level purposes. Given the common
accessibility of micro-computers in schools and their capacities for
scoring, storage, retrieval, analysis, reporting, and transmission, the
technology for implementing such systems is available and feasible for
measures which are common across classrooms and schools. Calibrated item
banks, anchor items, and meta-analysis techniques may someday permit
more particularistic data to be aggregated for decisionmaking at the
individual, class, school, district, state, and federal levels. These
possibilities deserve exploration now, toward a more rational, integrated
assessment system in the future.
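
As a rough illustration of the multipurpose idea described above, the
sketch below rolls one set of objective-level results up to classroom,
school, and district summaries. The record layout, the sample values,
and the function name are all hypothetical; the monograph does not
specify any data format or software.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, one per student and objective:
# (district, school, classroom, student, objective, proportion correct)
records = [
    ("D1", "S1", "C1", "ann", "main idea", 0.9),
    ("D1", "S1", "C1", "ben", "main idea", 0.5),
    ("D1", "S1", "C2", "cal", "main idea", 0.7),
    ("D1", "S2", "C3", "dee", "main idea", 0.3),
]

# How many leading fields identify a unit at each reporting level.
LEVELS = {"district": 1, "school": 2, "classroom": 3}


def aggregate(records, level):
    """Average each objective over all students in every unit at `level`."""
    depth = LEVELS[level]
    groups = defaultdict(list)
    for rec in records:
        key = rec[:depth] + (rec[4],)  # unit identifiers plus the objective
        groups[key].append(rec[5])
    # Students are pooled within a unit (not averaged class by class).
    return {key: round(mean(scores), 2) for key, scores in groups.items()}


for level in ("classroom", "school", "district"):
    print(level, aggregate(records, level))
```

The same student-level records serve each audience; only the grouping
key changes. Whether pooled student averages or some other summary is
appropriate at each level is a design choice the text leaves open.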

This is a long-range agenda. In the short run, however, school
districts can make a start in making external tests more relevant for
school- and classroom-level planning and/or in building internal
(classroom) tests that are useful in schoolwide and districtwide planning
and decisionmaking. The final chapter of this monograph describes some
productive models that districts can follow toward these ends.



CHAPTER 8 



DIRECTIONS FOR POLICY AND 
PRACTICE AT THE LOCAL LEVEL: 

LINKING TESTING WITH 
INSTRUCTIONAL PLANNING AND 
IMPROVEMENT 

In explaining the policy orientation underlying the Test Use in 
Schools Study, Chapter 1 listed several questions that are extremely 
common among and urgent for policy makers in local school districts. 
To restate those concerns here: many school districts are expanding 
their own testing programs. From district to district, however, teachers 
may differ in their willingness to administer these tests and to utilize 
their results. Under what conditions, then, are district tests most likely 
to be administered and used? What qualities should tests have in order
to make them attractive and useful from teachers' points of view? How
can district testing be effectively integrated with other assessment ac- 
tivities? 

This chapter suggests answers to these questions as it addresses a
somewhat broader one: How can districts and schools make more
effective use of test results in instructional planning and improvement?
The models and guidelines presented below are derived not only from
the general survey and detailed fieldwork findings of the Test Use in
Schools Study, but also from the on-site case studies of a
complementary CSE project which examined district organization and
management strategies for promoting test use (Bank & Williams, 1981a,
1981b, 1983).

These field studies demonstrate ways in which the utility of both the
external and internal tiers of assessment (as described at the outset of
Chapter 7) can be enhanced in local decision-making and in planning
for instructional improvement. There are, the data suggest, two
approaches that districts can follow to accomplish this goal. One
approach is to build from the inside out: to construct district tests that
have the characteristics of internal assessment tools (the validity for
local curricula, suitability for routine classroom purposes, and
immediate availability that appeal to teachers) and at the same time
provide consistent, reliable data that can be aggregated in ways useful
for school and district decisionmaking. The second approach is to build
from the outside in: to analyze information from externally mandated
measures currently given in the district and deliver it to schools at times
and in formats that maximize its utility in planning for curricular and
instructional improvement.

These approaches are not mutually exclusive; both can be followed 
simultaneously. But the effectiveness of either depends upon more than
the proper handling of testing and test scores. It also depends upon 
district systems that structure and support the use of testing information 
in an on-going planning process — systems of a type that are not widely 
present in most districts today. 

On the whole, as has been shown, most districts do not routinely
return test results to schools in ways that facilitate their use in
decisionmaking. Administrators review scores for the faculty in most
schools, but rarely on a periodic basis as part of routine procedures.
Follow-up to assure that teachers are giving attention to the content
area, skills, etc., that test scores indicate need emphasis is rarely
routine, either. (See Table 20, page 49.) Survey data show that the
majority of teachers are instructed in how to administer tests and that
they are informed about test results. Yet it appears that few receive
training in how to link teaching and testing or in how to use test results
in improving instruction. (See Chapter 4, Table 23, page 54.) These are
only some very general indicators that not many districts are closing the
testing-instruction loop with systematic planning mechanisms. They are
supported, however, by fieldwork from both the Test Use Study and the
other CSE project mentioned above. Furthermore, even though efforts of
the kinds investigated in this study are only the most elemental in a
district testing-instructional decisionmaking linkage system, they can
make a difference in how teachers view and use testing. Analysis of
survey data shows that where there is more support by district and school
leaders for the use of test results in planning, and where there is more
staff development in assessment, teachers have a significantly more
positive view of testing and its uses, and they also tend to treat the
results of district-objectives-based, standardized, and even
minimum-competency tests as more important in instructional
decisionmaking. (Review Table 31, page 84.) With this in mind, discussion
turns to some ways that districts can create successful links between
testing and planning for instructional improvement in their schools.



Building Links From the Inside Out 

Districts that follow this approach build outward from classroom
assessment needs to those of the school and district. They also build
from what should be taught to what should be tested. First they
construct district curricula, then district tests to match.

Two of the districts studied closely by CSE's projects were especially 
successful in taking this approach. Their slightly different testing- 
instruction linkage systems are useful models for others. 

The Central City Model* 

Located in the rural midwest, Central City School District serves
about 5,000 students in seven elementary schools, three junior highs,
and a high school. It has a long history of innovation and commitment to
curriculum development. It also has a group of teachers who pioneered
use of the high school's main-frame computers (originally purchased
and used for computer-assisted instruction) in the scoring and analysis
of teacher-made tests. These factors, and an energetic leader, joined in
the creation of Central City's system for linking test information with
instructional planning.

The test information. Each summer in recent years, the district has
sponsored curriculum development projects. But while the district
initiated, compensated, and guided, it was teachers who did the work.
Several representatives from the faculties of each school were selected
by their peers to participate.

Efforts began with the construction of an elementary-grade media (or
library) skills module and continued through the development of
complete mathematics and social science curricula for the elementary
grades. Later, the mathematics curriculum was extended through grade
8 and work began on a reading program. In each case, development was
done unit by unit in several stages. First, teachers decided on
instructional objectives and selected and/or wrote materials and
learning activities for achieving them. Then, pre- and post-tests
referenced to the objectives of each unit were designed and "mastery
levels" for each objective were specified. Units and accompanying tests
were piloted the next year; objectives, materials, and test items were
revised in light of teachers' criticisms and suggestions. Further
revisions incorporating teachers' feedback were made after the units went
into general use in schools across the district.

*The district names used here are pseudonyms. Any resemblance between
those names and those of actual districts and communities is unintended.

Testing materials were designed such that all the unit tests could be 
scored and analyzed by computer and returned to the teachers in a day or 
two. Results came in the form of a set of easy-to-read sheets, one for 
each student. The sheet listed each objective covered on the test, the 
number of items that measured the particular objective, the number of 
these items the student had correct and incorrect, and whether the 
number correct equaled "mastery." At the top of each sheet appeared a
paragraph that described the types of errors the student had made and 
summarized the types of difficulties the student seemed to be having 
with the skills or content covered. 
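
The per-student sheet just described can be sketched in a few lines.
This is only an illustration of the tallying logic, not a description
of Central City's actual software: the function name, the item format,
and the rule that mastery means reaching a set number correct are all
assumptions, and the narrative paragraph on error types is omitted.

```python
def student_sheet(items, responses, mastery_levels):
    """Tabulate one student's unit test by instructional objective.

    items: list of (objective, keyed answer), one entry per test item
    responses: the student's answers, in the same order as items
    mastery_levels: assumed minimum number correct per objective
    """
    tally = {}
    for (objective, key), response in zip(items, responses):
        n_items, n_correct = tally.get(objective, (0, 0))
        tally[objective] = (n_items + 1, n_correct + (response == key))

    sheet = []
    for objective, (n_items, n_correct) in tally.items():
        sheet.append({
            "objective": objective,
            "items": n_items,
            "correct": n_correct,
            "incorrect": n_items - n_correct,
            "mastery": n_correct >= mastery_levels[objective],
        })
    return sheet


# Hypothetical five-item unit test covering two objectives.
items = [("add fractions", "B"), ("add fractions", "D"),
         ("add fractions", "A"), ("place value", "C"), ("place value", "A")]
responses = ["B", "D", "C", "C", "A"]
mastery_levels = {"add fractions": 3, "place value": 2}

for row in student_sheet(items, responses, mastery_levels):
    print(row)
```

In this made-up case the student answers two of three fraction items
correctly (short of the assumed mastery level) and both place-value
items (mastery reached).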

In mathematics, the district had selected a sample of items from the 
unit tests and combined these to create mid-year and end-of-the-year 
summary measures given to students in all schools. Teachers received 
summary sheets of the type described above for these tests, too. (The 
district was considering developing similar tests in other subject areas 
once the process of curriculum and test-item revision was considered
complete.) 

All this applies to the lower grades, but similar developments had 
begun in the high school mathematics department. These were initiated 
by the teachers, who had worked toward common curricula and devising
computer-scored tests for various courses. In line with a general
district attitude, other departments were encouraged, but not required,
to follow this example. 

The end results of the district-wide effort were several: (1) curricula
that were consistent across the district, that teachers were invested in,
and that teachers actually used; (2) a system of tests that fit the curricula
and provided timely information in a form appropriate for a variety of
routine instructional decisions; and (3) a body of test information that
was valid and consistent from classroom to classroom and could thus be
aggregated and compared in school and district planning.

The structure of school decisionmaking. Within the schools, these
test data came into play in two main ways. First, they were routinely
used by teams of teachers in regular "unit" meetings. Elementary-school
"units" included several teachers (one of whom was chosen as
unit leader), a cluster of students across two or three grades, and
occasionally an instructional aide. Students were often divided among
unit teachers in different groupings for different subjects based on their
current level of achievement and rate of learning. (Some schools,
however, tended to use the self-contained classroom approach in some
grades.)

Unit teams met at least weekly during release time at the end of an 
abbreviated school day. At the beginning of the year, they discussed
students' placement and planned instructional emphases and pacing. 
Later on, they routinely examined students' progress, reviewed their 
placements, re-evaluated and altered their teaching, and discussed indi- 
vidual learner's problems and how best to address them. Data from 
district tests, as well as other available information, were routinely 
examined as these matters were considered. Unit meetings, then, were 
the primary setting for linking test data with instructional 
decisionmaking. (Where classrooms were self-contained, teachers re- 
ported using the district tests individually, as well as in unit meetings. 
And similar procedures were followed in the junior high and high 
school math departments.) 

A second use of district test data occurred periodically as principals 
established school goals and agendas for school in-service activities. 

District support systems. The linkage effort described above was
supported by the Central City School District in a number of ways.

First, district leaders initiated and provided resources for the
curriculum-and-test development. They also gave release time for weekly
unit meetings in which the test data were used for instructional planning.

Second, district administrative leaders provided staff development in
curriculum writing and test development. Originally, these weekly,
semester-long courses were led by professors from a state university.
Later, however, the district encouraged teachers to take over the classes:
to revise them, make them more practical and relevant for district staff,
and then to teach them. Credit on the district's pay scale was given for
participation in these classes.

Third, district administrators guaranteed on-going technical
assistance by maintaining close contact with the nearby Intermediate
Educational Agency (IEA). IEA help was routinely sought on problems in
test development and on scoring-and-analysis issues. The IEA also
provided some staff development in instruction.

Fourth, the district maintained media centers staffed by instructional
specialists in each school. Specialists helped unit teams and individual
teachers locate supplementary teaching materials to address learners'
needs. They also offered training in such areas as instructional
diagnosis and prescription.

Fifth, a district administrator worked with teacher committees in
piloting curriculum units and tests, eliciting teachers' critiques, and 
revising objectives, materials, and test items. 



It was this same administrator who encouraged continuing and
broadening the use of the computer-scoring-and-test-analysis process.

The Shelter Grove Model 

The Shelter Grove Unified School District is located in the
southwestern region of the country. Until three years ago, Shelter Grove
was an elementary school district. The recent merger with a local
secondary school district brought Shelter Grove's enrollment to about
5,700. These students are distributed through four elementary schools,
two middle schools (grades 6-8), and a four-year high school.

Shelter Grove's system for linking testing with instruction is similar 
to Central City's in several ways. Yet it is different enough to be worth 
description as a second "inside-out" model.

The test information. Like Central City, Shelter Grove administers 
tests of several types. But those that have the greatest power to
influence instruction in Shelter Grove schools are those developed by the
district and referenced to its continua (or sequences) of instructional 
objectives in reading, mathematics and writing (composition). 

Shelter Grove initially contracted with a commercial firm which 
promised to write test items for district-selected objectives and to
provide computer printouts of scores. Introduced in the early 1970's, these
tests failed to win teacher support. Teachers complained that the tests 
were not coordinated with anything that was taught. They also found 
that they did not know what to do with the results. 

Teacher committees were appointed to try to revise test items. They 
responded to the perceived need to align the coordinating tests with 
their curriculum by beginning to work on a district-level continuum of 
objectives. From then on Shelter Grove's experience paralleled the
more recent history of Central City. By the late 1970's, teacher
committees had devised continua of objectives and accompanying
criterion-referenced tests for reading and math, as well as similar tests for
language arts. More recently, a district writing continuum was
established.

Unlike the Central City materials, Shelter Grove's tests do not serve
as unit pre-tests or post-tests. And except in written composition,
district objectives are not accompanied by district-designed materials
or recommended learning activities. Rather, the continua are aligned
with the commercial reading and math text series used districtwide. 

The district tests were routinely administered to students by
classroom teachers on two or three occasions between October and February.
Scores were aggregated by the district's Testing Coordinator for
individual students, instructional groups, entire classes, and the school.
These profiles were sent to the schools in time for planning days that
occurred regularly at several points through the year.

In addition, proficiency tests composed of various segments of the 
district's criterion-referenced tests were administered to children in 
grades 4, 5 and 6 each year in April and May in accordance with state 
requirements. 

The structure of school decisionmaking. District tests were routinely 
used in each elementary and middle school during planning days that
occurred at several points in the school year. (The system had not yet
been introduced in the district's high school.) Two of these days were in 
June. On the first, the program of the school was routinely evaluated by 
the entire school staff looking at the group, classroom, and total school 
scores. These sessions functioned as a needs assessment for the next 
school year. On the second June planning day, individual teachers 
placed students in appropriate learning groups for the coming year
using the test-result profiles on each student.

In September of each year, test information was updated: information
on students new to the district was added. In October, teachers met
with their principals to set learning goals — benchmarks on the 
continuum that, based upon past performance profiles, they expected 
the children in each instructional group to meet. 

A mid-year evaluation took place each February. Summary reports 
on current-year testing were run, distributed, and examined. Principals 
met with teachers, as well as with the Superintendent and Assistant 
Superintendent for Instruction, to discuss students' progress. Plans for 
modifying the instructional program were made at this time. Then, in 
June, the cycle began anew with reference to the again-updated test- 
score profiles. 

Individual teachers also used criterion-referenced test information in 
reporting to parents each October and again each spring. Report cards 
listed continuum skills on one side and noted students' progress toward
each objective. And each May, letters were sent to the parents of
children who were two grade levels behind expected performance;
special conferences with these parents were also arranged. 

District support systems. As was the case in Central City, a number 
of district activities and programs helped to sustain the linking of test 
data with instructional planning in Shelter Grove. In addition to the 
district's leadership and resources in developing the
instructional-objectives continua and criterion-referenced tests, these included the
following. 

First, the district maintained a Professional Development Program 
(PDP) that provided teachers with the skills necessary to act upon the
test results. Coordinated by a full-time specialist, the PDP had evolved
over time based upon the Madeline Hunter orientation to teaching. 
Level One activities (for all new teachers, aides, and substitutes) dealt 
with such basic teaching skills as understanding goals and objectives, 
motivation and reinforcement, and task analysis and diagnosis. Level 
Two activities (which were not required but encouraged, and which 
many teachers joined) extended those of Level One with emphasis on 
individualizing instruction. Strategies for meeting affective needs, 
using inquiry skills, and teaching specific curriculum content were also 
covered. The program required teachers to apply PDP skills in their
own classrooms, with supervision and feedback from the PDP
coordinator. (Prior to the general implementation of this PDP program,
all principals had been required to take the Level One course plus 
courses in clinical teacher supervision.) 

Second, learning specialists conducted demonstration lessons, rec- 
ommended materials, conducted diagnoses of new students, and assist- 
ed teachers in planning and placement when new criterion-referenced 
test scores arrived in the schools. The learning specialists were consid- 
ered master teachers, and regularly played an important role in helping 
teachers use test information. They also explained changes in the 
continuum or changes in district policy to the faculty. With the PDP,
learning specialists were perceived as critical supports to the district's 
linkage effort. 

Third, a Testing Advisory Committee composed of a principal and
several teachers continually updated and improved the district's tests in 
light of teacher criticisms. This group also handled whatever adminis- 
trative and technical problems arose in testing, scoring, and reporting 
results. 

Fourth, ad hoc continuum revision committees made up of teachers
and learning specialists were paid during the summer to revise sections
of the continua as seemed appropriate. 

In addition to these formal organizational features, a variety of other
networking activities (e.g., principal observations, learning specialists'
visits to classrooms, monthly meetings of a district communications
council) helped district personnel work closely together in maintaining 
links between test data and instructional planning in the Shelter Grove 
schools. 

Guidelines

The experiences of Central City and Shelter Grove, especially in 
contrast to those of two other districts with similar but less successful
linkage systems (to be mentioned below), suggest a number of guidelines
for other districts to follow in linking testing with instruction from
the inside out. 

1. Build curriculum and assessment measures together "in-house."
Administrators and teaching staff in both districts believed very
strongly in the district development process. They felt that it helped
assure teacher "ownership" and confidence in both curricula and tests;
ownership and confidence, in turn, seemed to be important
prerequisites for teacher use. Shelter Grove's unhappy experience with tests
built outside the district, even when they were developed to district
specifications, supports this wisdom.

2. Assure a close fit between test items and curricular objectives and 
materials. 

This can best be done by designing curriculum first and then the tests, 
as was done in Central City and, ultimately, in Shelter Grove as well. 

Teachers are inclined to see district objectives-based or criterion- 
referenced tests as a burdensome irrelevancy if this condition is not met. 
New Branford, an urban district with 30,000 enrollment in the 
northeastern United States, attempted to devise criterion-referenced 
tests keyed to its district reading and math objectives. But when Test 
Use in Schools researchers visited New Branford schools, they found 
that few teachers used these tests. Continuum objectives were intended 
to fit with all of the five or six math and reading series used across the 
district. In fact, according to teachers, they fit well with none of them. 
Thus, teachers continued to use the tests included with these
commercial series to get the information on achievement they needed, and
they also had to give district tests to comply with district requirements.
But information from the latter was rarely consulted, and teachers
resented the mandate to give them. For similar reasons, Central City
teachers neglected their district's objectives-based reading tests, which
had been developed years earlier with little teacher participation and
without accompanying curriculum materials, although they were generally
enthusiastic about the tests in the other subjects. Teachers complained that the
reading tests were no longer valid for the two basal reading series used 
in Central City. 

3. Strive for maximum teacher involvement.

To help build curriculum and tests that teachers own and use,
teachers' participation in the development process must be more than
nominal. Both Shelter Grove and Central City included many teachers on
their development committees; these teachers did the real work of
constructing the curricula (or continua) and the test items. Mechanisms
were provided that allowed all district teachers to offer feedback on a
regular basis. Their criticisms were taken seriously in the revision
process. 

In contrast, New Branford (mentioned just above) and Metro District
(another urban district studied by the CSE Test Use Project) had only a 
small number of teachers on district advisory committees as they con- 
structed continua of objectives and accompanying tests. These teachers 
did not participate in the actual development process; their presence 
was not visible to district faculty; they had little impact on the tests that 
evolved. And in neither district did teachers feel the objectives or tests
were completely suitable. New Branford teachers' response has been
described. Teachers' response to Metro District's tests was quite mixed.

4. Construct tests that cover the entire range of skills in the curriculum
and/or continuum of objectives. 

The district tests of Central City and Shelter Grove included items 
that assessed students' performance on skills and content from the most
elemental to the most advanced in the subject areas tested. Metro
District (enrollment over 100,000), in contrast, purchased tests for each
grade level in reading, math, and language arts that covered only the
simplest skills to be taught. In the economically disadvantaged
neighborhoods where more students had trouble with these skills, test results
did help teachers identify the skills in which individuals and class groups
needed remediation. But in these schools, the tests also functioned to
push the actual curriculum in the direction of the most elemental skills. 
Teachers and principals wanted students (and their schools) to do well 
on the tests each spring. Thus, they spent much time drilling and re- 
drilling children on the elemental skills tested. Simultaneously, they 
gave shorter shrift in their teaching to other skills specified for the grade 
level, which were not included on the test. Elsewhere in the district, 
where students routinely obtained 90 percent to 100 percent correct on 
these same tests, they yielded little diagnostic or placement informa- 
tion for teachers. 

One moral of these contrasting stories, then, is test what you want 
teachers to teach, because teachers will place their teaching emphasis 
on what you test. 

Several other "do's" and "don'ts" can be abstracted from the Central
City, Shelter Grove, and similar but less successful models. These,
however, are equally pertinent to the "outside-in" linkage approach
discussed next. Thus, they will be omitted here and mentioned in the
concluding summary.






Building Links From the Outside In 

Districts that follow this approach adapt information from externally
mandated tests to suit the district's and/or schools' planning needs. In
so doing, they support school-level planning structures and procedures, 
just as districts taking the inside-out path do. 

The testing-instruction linkage systems of two districts that followed 
the outside-in approach are described below. They provide very
different, but equally instructive models.

The St. John Model 

The St. John School District covers a wide geographic area of
suburban and semi-rural municipalities in a Western state. Its 72 schools
serve between 40 and 50 thousand students in grades K-12.

Linking testing with instructional planning began in St. John during 
the mid-1970's when the state legislature enacted a program intended to
stimulate local planning for school improvement. Participation in the
program was voluntary, but over the years most of St. John's
elementary schools, along with two of its junior high schools and one
high school, elected to participate. The district encouraged this
involvement; in turn, the schools' participation stimulated district efforts to
provide test data for use in local site planning. 

The test information. Long before the advent of the state-sponsored
school improvement program, St. John School District had required
administration of the Iowa Test of Basic Skills. Students were tested
each January in grades 2-6. The purposes this information had served
previously are not germane here. But once numerous St. John schools
joined the state program, test data became especially important for
them. Guidelines for the state school-improvement planning process
required that in establishing improvement plans schools specify: (1) the
"existing level of performance" in a particular area, (2) the "needed
program changes or additions," (3) improvement objectives, and (4)
activities to measure these objectives. Major activities to be undertaken
in pursuit of each objective also had to be described, along with budgets
and other improvement program features. But the four requirements
enumerated here were those that called for "hard data" such as test
results.

It seemed reasonable to use ITBS results in developing these
improvement plans, yet district administrators realized that these results
came back from the test publisher in a form that was cumbersome.
Computer printouts presented the results for each sub-test area for each
grade for each year on a separate page. Principals and teachers found
these reports complicated as well as overwhelming in volume.
Consequently, the district undertook development of what it now calls the
Academic Performance Profile (APP). 

The APP gave each district elementary school an annual overview of 
its ITBS test results for all years and all grades for a particular subtest 
(e.g., reading comprehension, math concepts, etc.) on a single page. 
This reduced fifty pages of computer printout to approximately six
ordinary 8 by 11 inch pages.

In addition, the APP simplified the format in which the information 
appeared. Simple graphs were devised to visually display: (1) the
scores of student groups as they moved through the grades (1982 first
graders as second graders in 1983, etc.); (2) the performance at various
grade levels in various years (the fourth grade in 1981, 1982, 1983,
etc.); and (3) the gains (indicated in terms of grade-level growth)
realized from one year to the next for the various grade levels (the gains
made by the 1982 second-grade group as third graders in 1983). Two
simple tables on each page (that is, for each sub-test) supplemented the
three line graphs.
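
The three views that the APP graphed can be illustrated with a short
sketch. It is offered only as an illustration: the scores are invented,
the function names are hypothetical, and the use of grade-equivalent
scores is an assumption; the district's actual calculations are not
documented here.

```python
# Invented grade-equivalent scores for one subtest, keyed by (year, grade).
# These numbers are illustrative only and do not come from the St. John data.
scores = {
    (1981, 2): 2.4, (1981, 3): 3.3, (1981, 4): 4.5,
    (1982, 2): 2.6, (1982, 3): 3.5, (1982, 4): 4.4,
    (1983, 2): 2.5, (1983, 3): 3.8, (1983, 4): 4.7,
}


def cohort_track(scores, start_year, start_grade):
    """View 1: follow one student group as it moves up a grade each year."""
    track = []
    year, grade = start_year, start_grade
    while (year, grade) in scores:
        track.append((year, grade, scores[(year, grade)]))
        year, grade = year + 1, grade + 1
    return track


def grade_history(scores, grade):
    """View 2: one grade level's results across successive years."""
    return sorted((y, s) for (y, g), s in scores.items() if g == grade)


def yearly_gains(scores):
    """View 3: growth of each group from one year and grade to the next."""
    gains = {}
    for (year, grade), score in scores.items():
        following = (year + 1, grade + 1)
        if following in scores:
            gains[(year, grade)] = round(scores[following] - score, 1)
    return gains
```

Calling cohort_track(scores, 1981, 2) follows the 1981 second graders
into third grade in 1982 and fourth grade in 1983, while
yearly_gains(scores) reports each group's growth in grade-level units.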

Since the state program guidelines also called for annual needs
assessment, the St. John District created survey questionnaires for staff,
parents, and students. These solicited respondents' perceptions of: (1)
the effectiveness of schools' various programs; and (2) how much
attention should be given to improvement in each program area. Each
school could add up to 20 questions to the set used in common across
the district. Surveys were administered each spring. The district's
evaluation office tabulated survey results for each
school and returned them in a concise form. 

The structure of school decisionmaking. The state's school
improvement program mandated the creation of a School Planning Council
(SPC) in each participating school. Guidelines directed that the SPC
membership include the principal and elected representatives of the
teachers, of other school staff, of parents and other community
members, and (at the secondary level) of the student body. This group was
assigned central responsibility for establishing needs, goals, and
activities for school improvement, as well as for budgeting the state funds
provided to the school for improvement activities.

St. John's district evaluation specialists, however, elaborated on
these state requirements. They urged their schools to also create
"component committees," smaller groups (including SPC members and
others) who were charged with planning for improvement in particular
areas: in each subject area, in school environment, in human relations,
in staff development, etc.

Component committees reviewed the ITBS/APP summary forms, 
survey results, and other information. They specified and documented
needs, set objectives, and developed school and classroom activities to 
realize them. They also stated how achievement of the objectives would 
be evaluated and proposed a budget suitable for their plan. In a next 
step, various component committees presented their particular plans to 
the School Planning Council. The SPC accepted or suggested changes 
in each improvement-plan component and made decisions regarding 
final allocation of state program dollars among the various components. 
The SPC also monitored implementation of the plan through the com- 
ing school year. 

While plans were routinely developed for a three-year period,
revisions were made each spring based on information gathered during the
current school year. Thus, school improvement planning was an annual
process centered in the spring, but implementation of plans and SPC 
monitoring occurred continuously during each school year. 

Interviews with participants and observation of planning meetings 
indicated that test data (and survey results) were used in deciding upon
and substantiating needs, specifying objectives, evaluating implemen- 
tation, and revising the plans. SPC members also routinely referred to 
this information in making and justifying budgetary decisions. 

District support systems. The St. John School District supported its 
testing-instruction linkage system in many of the same ways that Shel- 
ter Grove and Central City supported their quite different models. 

First, staff development in the organization and process of planning, 
including the use of the APP test summaries, was conducted for 600 
district personnel during their first year in the state program. Others 
received this introductory training as they entered the program.
Furthermore, teachers, principals, and parents agreed that the regular
availability of the district's two evaluation specialists was a key to the
program's maintenance. They routinely provided staff development and
answered ad hoc questions regarding planning and test-data use.

Second, St. John maintained a comprehensive staff-development 
program in instructional techniques, which everyone agreed was a 
major factor in facilitating the realization of school plans. 

The Bayview Model

Bayview is a community of 100,000 located about 50 miles from a
major Western metropolitan area. The Bayview Unified School
District's sixteen elementary schools, four junior highs, and three
senior highs enroll 14,000 students.

Bayview's six-year-old effort at testing-instructional linkage was 
more diffuse than that in most of the other school districts visited by 
CSE researchers. Interest in testing and evaluation was relatively new, 
and many in the district were as yet skeptical of their value. Nonethe- 
less, the need to comply with externally mandated testing programs 
stimulated a small group of district administrators to try to make greater
local use of the test scores they yielded. Only one of these uses will be
discussed here. It offers an example of "outside-in" testing-instruction
linkage that is quite different from the St. John School District's model.

The test information. Three different achievement testing programs
that figured in the Bayview linkage system are described here. The first of
these was the State Assessment Program (SAP). This half-hour test was
administered each spring to students in grades 3, 6, and 10 in accord
with state requirements. The test was devised by the state and
referenced to objectives common to many state-approved text series.
Items were matrix sampled; not every student was asked to respond to
identical questions. Thus, data for individual students were not
reported. Results focused on grade level and school patterns.
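
Matrix sampling, as described above, can be sketched in a few lines. The
sketch is a simplified illustration under stated assumptions: the
rotation of item blocks across students and both function names are
hypothetical, and nothing here reproduces the state's actual sampling
design.

```python
def assign_forms(student_ids, item_ids, n_forms):
    """Split the item pool into n_forms blocks and rotate them across students.

    Each student answers only one block, so no student sees the whole pool.
    """
    forms = [item_ids[i::n_forms] for i in range(n_forms)]
    return {sid: forms[k % n_forms] for k, sid in enumerate(student_ids)}


def group_percent_correct(responses):
    """Pool answers across students to get percent correct for each item.

    responses maps student id -> {item id: True/False}. Because each student
    answered a different subset, only group-level results are meaningful.
    """
    attempts, correct = {}, {}
    for answers in responses.values():
        for item, is_correct in answers.items():
            attempts[item] = attempts.get(item, 0) + 1
            correct[item] = correct.get(item, 0) + (1 if is_correct else 0)
    return {item: 100.0 * correct[item] / attempts[item] for item in attempts}
```

Under such a scheme the full item pool is covered in a short testing
session, which is consistent with reporting grade-level and school
patterns rather than individual scores.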

A second test used by Bayview was the norm-referenced,
standardized Comprehensive Tests of Basic Skills (CTBS). The district had just
begun to require this test in all schools for grades 1-9 when CSE 
field work was conducted. Formerly, it had been given only in schools 
with Title I (now Chapter 1) compensatory education programs. 

The district's proficiency (or minimum competency) testing program
was also used in testing-instruction linkage. Forms for grades 5, 9, 10,
and 11 had been developed with the help of consultants to meet the
state's mandate. These measures covered reading, writing, and
mathematics skills deemed essential for "life coping." The current forms of
the test were introduced in 1978.

The decision-making structure. The data from these three tests were
brought to bear on instructional planning in several ways by Bayview
district leaders. Chiefly, however, they had begun to use the three test
programs mentioned above as content for staff development course 
work in task analysis and diagnostic-prescriptive teaching. 

District leaders had won grant funds from the state to create a
Professional Development Center (PDC). The primary focus of the PDC's
program was the continuing development of effective teaching
strategies. A Teacher Center funded by a federal grant augmented the PDC.
Curriculum development and the translation of educational research for
practical, instructional applications were the central thrusts of the
Teacher Center's program. The very presence of these two centers
testified to Bayview's emphasis on teaching-effectiveness skills. In
addition, principals were required to attend workshops dealing with 
supervision, and these focused on the elements of effective teaching. 

It was in the context of increasing external test mandates and the 
emphasis on staff development that Bayview's linkage system began to
take shape. From the perspective of District leaders, Bayview teachers
and principals were not facing the issues raised by the District's rela- 
tively poor performance on the external measures. In response, said the 
Director of Staff Development: 

We [at the central office] tried to model a problem-solving way of
looking at it so principals could do similarly in their schools. The
Director of Instruction worked with principals in the way he wanted them to
work with teachers. Also, we asked teachers if they were addressing
areas of the test. They said they were. When we observed, we found
teachers had difficulty defining the skills to be taught as well as
diagnosing for these skills. As a result, we built task analysis cycles into our
Professional Development Center programs focusing on the low scoring
skill areas identified by the State Assessment Program.

The district's cadre of leaders began by training principals to exam- 
ine SAP (and later the other tests mentioned earlier) to see what specific 
skills they assessed. Once these were identified, the next step was for 
principals and faculties to examine their school's curricula in order to 
determine whether these skills were being taught and if so at what 
grades and with what emphasis. Staff development provided principals, 
and later teachers, with the information and techniques they needed to 
do this. 

This was taking place with varying degrees of thoroughness in
different Bayview schools when CSE staff members visited the district. At
the same time, areas of curricular and instructional weakness
districtwide had been identified by district administrators. These areas
were then targeted for sessions on diagnostic-prescriptive teaching and
other instructional skills. 

Analysis of test results also suggested areas for emphasis in the 
development of continua. Citing the impact of proficiency-test skill and 
score analysis, for example, the Bayview Coordinator of Curriculum 
said:

The proficiency exam has helped the district focus on curriculum. . . .
[We learned that] in math we teach computation but the test tests
applications through story problems.






Thus, in the Bayview Unified School District, task analysis of tested
skills served as the basis for a comprehensive examination of the
district's curricula and suggested areas of curricular weakness.
Simultaneously, analysis of test results led to the identification of teaching
weaknesses. Links between testing and instruction were generated
through the development of district-wide objectives and in Professional 
Development Center and Teacher Center programs. 

Guidelines 

The St. John and Bayview districts had put in place very different 
kinds of systems for linking the results of externally mandated testing 
with instructional planning in their schools. Nevertheless, it is possible 
to abstract a number of guidelines from their "outside-in" models.
Other districts would be well advised to bear these in mind should they
choose to follow the outside-in approach.

1. Make test-score data comprehensible for teachers and principals.
Providing test results in a format that facilitates their use is obviously
a key to testing-instruction linkage. That professional educators
working in the schools can be bewildered and intimidated by reports of
scores from externally mandated measures was clear in Test Use in
Schools Study fieldwork (cited early on in this paper). It was equally
apparent in the early experiences of district administrators in both
Bayview and St. John. The latter addressed this problem by translating
the scores into succinct, easy-to-read, and relevant tables and graphs. 
Bayview dealt with it by teaching principals and teachers to dissect the 
tests and test results. 

2. Train teachers and principals to use test scores as diagnostic tools. 
As noted earlier, the results of externally mandated tests are
commonly used in a brief and casual way to get a general comparative
reading on group performance. The essence of their use in the St. John
and Bayview systems was diagnostic. They played a role in identifying
patterns of strength and weakness in particular content areas and skills.
They served to stimulate questions such as "Why are we scoring as we
are scoring in this curriculum area?" and "How can we improve?"
Diagnostic uses are not routine in most schools. Simply presenting test
scores in clean, readable format does not mean that diagnosis of
curricular strengths and weaknesses will occur. Teachers need instruction and
practice in analyzing the different factors that underlie test
performance. They need instruction and help in abstracting meaning from
scores. Survey findings suggest that most districts do not provide this. 
In different ways, both St. John and Bayview did. 






3. Expect that results of externally mandated tests will serve as only one
source of information in planning and decision-making.

Wisely, neither Bayview's cadre of leaders nor St. John's district
evaluation specialists tried to make test results the sole basis for
educational decisions. Human values and priorities do and should influence
decisions about what objectives to pursue in school improvement or to
build into district continua. The day-to-day experiences with students
that teachers and principals rely upon so heavily are very relevant in
making instructional decisions. These factors were routinely accepted,
along with test data, as bases for decision-making by St. John
administrators as they assisted School Planning Councils and reviewed their
plans. Bayview's Coordinator of Staff Development, too, recognized
that test data needed to be examined in light of other factors as he
explained, "When we see through our task analysis and curriculum
review what we are and are not teaching, the next step is to ask, 'Do we
or don't we want to teach this? How important is it for our students?'"

Data from externally mandated tests can serve to identify problems, 
to support or disconfirm experience-based judgments, and to stimulate 
questions. They can be used to justify or rationalize decisions that have
already been made. But as the separate experiences of St. John (recall
their needs assessment questionnaires) and Bayview (recall their
juxtaposition of multiple measures to district curricula) indicate, test data in
themselves are only one important source of information for
educational planning.

Summary and Conclusions 

CSE's national survey and its fieldwork in two research projects
suggest that both testing that is internal to the school and that which is 
externally mandated can be used more fully in systematic educational 
decisionmaking. Districts can build a curriculum and tests that can 
serve teachers' routine classroom needs and simultaneously provide
consistent, reliable, and valid data for school and district planning. 
Districts can also capitalize upon data from externally mandated testing 
by adapting it to local needs. No single approach or model will be 
appropriate to every setting. But whether a district chmvses to pursue 
linkage from the inside out or from the outside in. there are several 
factors that seem necessary for success. 

One of these is district leadership. In each district studied by CSE,
there was an individual or a small group in the district office (idea
champions and supporters) who were vitally interested in using test
data in instructional planning and decisionmaking. CSE's national test
use survey substantiates that such leaders make a difference in school- 
level uses of test information. 

A second element in district success is an organizational arrange- 
ment — a setting and set of procedures — for decision-making. In 
Central City schools there were the weekly meetings of unit teams; in 
St. John, regular sessions of the School Planning Councils. Shelter 
Grove held its principal-teacher planning days in June, October, and 
February each year. In Bayview, the locus of linkage was staff development
workshops, continuum-building committees, and regular school
faculty meetings. These organizational arrangements motivated and
structured the use of test results by creating (1) real needs for information,
and (2) procedures by which the implications of test-score patterns
could be discussed and acted upon. None of the districts with successful 
linkage systems simply offered schools test data and left their use to 
chance. 

Third, each of the districts managed testing and/or test results such 
that they increased the marginal utility of test information for teachers 
and principals. Teachers routinely receive data on student achievement 
as they watch their students in class, review their assignments, and 
grade classroom tests. These data are immediate, rich, and compelling. 
So too is the information principals regularly gather as they talk with
staff and visit their classrooms. To be as useful and as compelling,
external test information must add "something new" to what teachers
and principals already know. Each of the four models described above
did this. Central City's computer-scoring-and-analysis system for unit
tests summarized individual students' mastery of objectives, as well as
their errors and weaknesses. Shelter Grove compiled data on the
progress of individuals and instructional groupings toward benchmark
goals. St. John's Academic Performance Profiles charted year-to-year
trends and annual gains. Bayview's task analysis projects, based on
tested skills and test scores, helped to reveal why and how students'
performance came to be as it was. In each case, test data were configured
in ways that told teachers and principals something more than "your
students are doing well in this and not so well in that," which is
information teachers and principals typically feel they already have.

A fourth and final element in successful district linkage is the maintenance
of on-going resource and support systems. In the districts studied,
these centered in the area of staff development: training in test
development and use, training in how to realize instructional goals
derived from test information, or both. Frequently, too, instructional
support staff (learning specialists, media specialists, evaluation specialists)
were routinely available to provide help and answer questions.
Support also took the form of adaptability and flexibility on the
part of district administrators. Clear channels were open for Central
City and Shelter Grove teachers to participate in the development of,
and to criticize the quality of, district curriculum and tests. St. John's
evaluation specialists revised district needs-assessment surveys in light
of teachers' feedback; local schools could add survey items suitable to
their particular concerns. Bayview district leaders showed patience and
understanding in encouraging principals and teachers to take a "problem-solving
approach" to low test scores. And of course, each district
supported its testing-instruction linkage system with release time and
other resources.

The models and guidelines suggested here will not answer all the 
questions and concerns school districts will encounter as they work 
systematically to link testing and instruction in an on-going process of
school renewal. But they do indicate productive paths toward the more
efficient use of testing and the improvement of educational planning in 
American schools. 

